Preface



This volume contains the proceedings of the 12th International Conference on 
Rewriting Techniques and Applications (RTA 2001), which was held May 22-24, 
2001 at Utrecht University in The Netherlands. RTA is the major forum for the 
presentation of research on all aspects of rewriting. Previous RTA conferences 
were held in Dijon (1985), Bordeaux (1987), Chapel Hill (1989), Como (1991), 
Montreal (1993), Kaiserslautern (1995), Rutgers (1996), Sitges (1997), Tsukuba 
(1998), Trento (1999), and Norwich (2000). 

There were 55 submissions from Argentina (|), Australia (1), France (12|), 
Germany (ll|), Israel (l|), Italy (2), Japan (8|), The Netherlands (6), Slovakia 
(^), Spain (4), UK (2|), USA (3), and Venezuela (1), of which the program 
committee selected 23 regular papers and 2 system descriptions for presentation. 
In addition, there were invited talks by Arvind (Rewriting the Rules for Chip 
Design), Henk Barendregt (Computing and Proving), and Michael Rusinowitch 
(Rewriting for Deduction and Verification). 

The program committee awarded the best paper prize to Jens R. Woinowski 
for his paper A Normal Form for Church-Rosser Language Systems. In this paper 
the surprising and important result is shown that all Church-Rosser languages 
can be defined by string rewrite rules of the form $uvw \to uxw$ with $v$ being
nonempty and $x$ having a maximum length of one.

Many people helped to make RTA 2001 a success. I am grateful to the mem- 
bers of the program committee and the external referees for reviewing the sub- 
missions and maintaining the high standards of the RTA conferences. It is a 
particular pleasure to thank Vincent van Oostrom and the other members of 
the local organizing committee for organizing an excellent conference in a rather 
short period. Finally, I thank the organizers of the four events that collocated 
with RTA 2001 for making the conference even more attractive: 

— 4th International Workshop on Explicit Substitutions: Theory and Applica- 
tions to Programs and Proofs (Pierre Lescanne), 

— 5th International Workshop on Termination (Nachum Dershowitz), 

— International Workshop on Reduction Strategies in Rewriting and Program- 
ming (Bernhard Gramlich and Salvador Lucas), 

— IFIP Working Group 1.6 on Term Rewriting (Claude Kirchner). 
Computing and Proving



Henk Barendregt 

University of Nijmegen 
henk@cs.kun.nl



Abstract. Computer mathematics is the enterprise to represent sub- 
stantial parts of mathematics on a computer. This is possible also for 
arbitrary structures (with non-computable predicates and functions), as 
long as one also represents proofs of known properties of these. In this 
way one can construct a ‘Mathematical Assistant’ that verifies the
well-formedness of definitions and statements and helps the human user to
develop theories and proofs.

An essential part of the enterprise consists of a reliable representation 
of computations $f(a) = b$, say for $a$, $b$ in some concrete set $A$. We will
discuss why this is so and present two reliable ways to do this. One 
consists of following the trace of the computation in the formal system 
used to represent the mathematics ‘from the outside’. The other way
consists of doing this ‘from the inside’, building the assistant around a
term rewrite system. The two ways will be compared. 

Other choices in the design of a Mathematical Assistant are concerned 
with the following qualities of the system:

1. reliability; 

2. choice of ontology; 

3. choice of quantification strength; 

4. constructive or classical logic; 

5. aspects of the user interface. 

These topics have been addressed by a number of ‘competing’ projects, 
each in a different way. From many of these systems one can learn, but a 
system that is satisfactory on all points has not yet been built. Enough 
experience through case studies has been obtained to assert that the
time is now ripe for building a satisfactory system.
Rewriting for Deduction and Verification



Michael Rusinowitch 

LORIA — INRIA 
615, rue du Jardin Botanique
BP 101, 54602 Villers-lès-Nancy Cedex, France
Abstract. Rewriting was employed even in early theorem-proving systems
to improve efficiency. Theoretical justifications for its completeness
were developed through work on Knuth-Bendix completion and convergent
reduction systems, where the key notions of termination and confluence
were identified. A first difficulty with reduction in automated
theorem-proving is to ensure termination of simplification steps when 
they are interleaved with deduction. Term orderings have been widely 
investigated and provide us with good practical solutions to termination 
problems. A second difficulty is to keep completeness with unidirectional 
use of equations, both for deduction and simplification, which amounts
to restoring a confluence property. This is obtained by extending the
superposition rule of the Knuth-Bendix procedure to first-order clauses.
Rewrite-based deduction has found several applications in formal verifi- 
cation. We shall outline some of them in the presentation. In computer- 
assisted verification, decision procedures are typically applied for elimi- 
nating trivial subgoals (represented, for example, as sequents modulo a 
background theory). The computation of limit sets of formulas by iter- 
ating superposition rules generalizes Knuth-Bendix completion and per- 
mits the uniform design of these decision procedures. Rewriting combined 
with a controlled instantiation mechanism is also a powerful induction 
tactic for verifying safety properties of infinite-state systems. A nice fea- 
ture of the so-called rewriting induction approach is that it allows for the 
refutation of false conjectures too. A more recent application of rewriting 
to verification concerns security protocols. These protocols can be com- 
piled to rewrite systems, since rewriting nicely simulates the actions of 
participants and malicious environments. 
Abstract. In the framework of interaction nets, Yves Lafont has
proved [5] that every interaction system can be simulated by a system
composed of 3 symbols named $\gamma$, $\delta$ and $\varepsilon$. One may wonder if it is possible
to find a similar universal system with fewer symbols. In this paper, we
show a way to simulate every interaction system with a specific interac- 
tion system constituted of only 2 symbols. By transitivity, we prove that 
we can find a universal interaction system with only 2 agents. Moreover, 
we show how to find such a system where agents have no more than 3 
auxiliary ports. 



1 Introduction 

In jSj, Yves Lafont introduces interaction nets, a programming paradigm inspired 
by Girard’s proof nets for linear logic. Some translations from $\lambda$-calculus
into interaction nets PUSj or from proof nets 1711 0121111 II show that universal 
interaction systems are interesting for computation. We can explain this interest 
for these translations by the fact that computation with interaction nets is purely 
local and naturally confluent. Reductions can be made in parallel. Moreover, the 
number of steps that are necessary to reduce completely a net is independent of 
the way one may choose. From the point of view of $\lambda$-calculus, translations used
in nil captures optimal reduction. 

In [5], Lafont introduces a universal interaction system with only three
different symbols $\gamma$, $\delta$ and $\varepsilon$. $\delta$ and $\varepsilon$ are respectively a duplicator and an eraser and
$\gamma$ is a constructor. This system preserves the complexity of computation for a
particular system. The number of steps that are necessary to reduce a simulated 
interaction net is just (at most) multiplied by a constant (which depends only 
on the simulated system and not on the size of the simulated net). 

One may wonder if it is possible to find a simpler universal interaction system 
with only 2 symbols. This paper answers yes to this question. In fact, we prove 
that we can simulate a particular interaction system with only two symbols. By 
simulating a universal system, we prove that a universal system constituted of 
only two symbols exists. 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. .S-THI 2001. 
Using the universal system in [5], the resulting universal system has two
symbols: one is an eraser and the second is a constructor/rule encoder. The
eraser has no auxiliary port but the second one has 16 auxiliary ports! In fact, 
we show a way to reduce this number to only three auxiliary ports. 

This paper is organized as follows: after an introduction to interaction nets 
and interaction systems, the notions of translations, simulations and universal 
interaction systems are presented. Section 4 is the heart of this article. It shows 
how to reduce a system to a system with only two agents. Section 5 reduces the 
number of auxiliary ports of the agents in this system to 0 and 3. 



2 Interaction Systems 

This model of computing is introduced in . We briefly recall what interaction 
nets and interaction systems are. 



2.1 Agents and Nets 

An interaction net is a set of agents linked together through their ports. An 
individual agent is an instance of a particular symbol which is characterized 
by its name $\alpha$ and its arity $n \ge 0$. The arity defines the number of auxiliary
ports associated to each agent. In addition to auxiliary ports, an agent owns a 
principal port. Graphically, an agent is represented by a triangle: 



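[Figure: an agent $\alpha$ drawn as a triangle with its principal port 0 and auxiliary ports 1, ..., $n$, next to its mirror image.]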



With $\alpha$, auxiliary ports go clockwise from 1 to $n$, but with $\bar{\alpha}$ they go in the
other direction ($\bar{\alpha}$ is obtained from $\alpha$ by symmetry). Here, the
principal port is denoted 0 and the auxiliary ports 1, ..., $n$.

An interaction net is a set of agents where the ports are connected two by 
two. The ports that are not connected to another one are the free ports of the 
net and are distinguished by a unique symbol. The set of the symbols of the free 
ports of a net constitutes the interface of this net. Below, the interface is $\{y, x\}$. $\alpha$
has one auxiliary port, $\beta$ has two and $\varepsilon$ has none.





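[Figure: an interaction net built from agents $\alpha$, $\beta$ and $\varepsilon$, with free ports $x$ and $y$.]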
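The definitions of this subsection can be made concrete with a small data structure. The following Python sketch is only an illustration: the class `Net`, its method names and the particular wiring of the example net are assumptions made here, since the original figure is not reproduced. Only the arities of $\alpha$, $\beta$ and $\varepsilon$ and the interface $\{y, x\}$ come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Net:
    """An interaction net: agents plus wires joining ports two by two."""
    arity: dict                                  # symbol -> number of auxiliary ports
    agents: dict = field(default_factory=dict)   # agent id -> symbol
    wires: dict = field(default_factory=dict)    # endpoint -> endpoint (stored both ways)

    def add_agent(self, agent_id, symbol):
        self.agents[agent_id] = symbol

    def connect(self, p, q):
        # An endpoint is either (agent id, port index) or a free-port name (str).
        self.wires[p] = q
        self.wires[q] = p

    def interface(self):
        # Free ports are the endpoints that are names rather than agent ports.
        return {p for p in self.wires if isinstance(p, str)}

    def is_well_formed(self):
        # Every port 0..arity of every agent must be wired.
        expected = {(a, i) for a, s in self.agents.items()
                    for i in range(self.arity[s] + 1)}
        actual = {p for p in self.wires if isinstance(p, tuple)}
        return expected == actual

    def active_pairs(self):
        # Pairs of agents connected through their principal ports (index 0).
        return {frozenset((p[0], q[0])) for p, q in self.wires.items()
                if isinstance(p, tuple) and isinstance(q, tuple)
                and p[1] == 0 and q[1] == 0}


# Hypothetical example net with interface {x, y}.
net = Net(arity={"alpha": 1, "beta": 2, "epsilon": 0})
for agent_id, symbol in [("a", "alpha"), ("b", "beta"), ("e", "epsilon")]:
    net.add_agent(agent_id, symbol)
net.connect(("a", 0), ("b", 0))   # principal ports face each other
net.connect(("a", 1), "x")
net.connect(("b", 1), "y")
net.connect(("b", 2), ("e", 0))
```

In this representation a net is irreducible exactly when `active_pairs()` returns the empty set.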
2.2 Interaction Rules and Interaction Systems

An interaction net can evolve when two agents are connected through their 
principal port. An interaction rule is a rewriting rule where the left member is 
constituted of only two agents connected through their principal ports and the 
right member is any interaction net with the same interface. 

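[Figure: general form of an interaction rule between two agents connected through their principal ports; the left and right members have the same interface.]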





An interaction net that does not contain two agents connected by their
principal ports is irreducible (we also say reduced). A net is reduced by applying an
interaction rule to a pair of agents connected through their principal ports.
This step replaces the pair by the right member of the rule. A reduction
can be repeated several times.

An interaction system $\mathcal{I} = (\Sigma, \mathcal{R})$ is a set of symbols $\Sigma$ and a set of
interaction rules $\mathcal{R}$, where agents in the left and right members are instances of the
symbols of $\Sigma$.

An interaction system $\mathcal{I}$ is deterministic when (1) there exists at most one
interaction rule for each pair of different agents and (2) there exists at most
one interaction rule for the interaction of an agent with itself. In this case, the
right member of this rule must be symmetric with respect to a center point. An interaction
system $\mathcal{I}$ is complete when there is at least one rule for each pair of agents.
In this paper we consider deterministic and complete systems. With these
systems, we can prove that reduction is strongly confluent. In fact, this property
is true whenever the system is deterministic. Moreover, it is assumed that the right
member of every rule has no deadlock and does not introduce an infinite recursive
computation or a computation that creates a deadlock. Thus, we can always
erase every part of the right member of a rule with eraser agents denoted $\varepsilon$.
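The two counting conditions above (at most one rule per pair for determinism, at least one for completeness) can be checked mechanically. The sketch below is a minimal illustration under assumptions: rules are represented as hypothetical `(symbol, symbol, right member)` triples, and neither the symmetry condition on self-interaction rules nor the absence of deadlocks is checked.

```python
from collections import Counter
from itertools import combinations_with_replacement


def pair_counts(rules):
    """Count rules per unordered pair of symbols; rules are (sym, sym, rhs) triples."""
    return Counter(frozenset((a, b)) for a, b, _rhs in rules)


def is_deterministic(rules):
    # At most one rule for each unordered pair (including a symbol with itself).
    return all(count <= 1 for count in pair_counts(rules).values())


def is_complete(symbols, rules):
    # At least one rule for each unordered pair (including a symbol with itself).
    counts = pair_counts(rules)
    return all(counts[frozenset(pair)] >= 1
               for pair in combinations_with_replacement(sorted(symbols), 2))


# A system over three symbols that is both complete and deterministic
# has exactly 3 * 4 / 2 = 6 rules; the right members are placeholders.
symbols = {"gamma", "delta", "epsilon"}
rules = [(a, b, "right member omitted")
         for a, b in combinations_with_replacement(sorted(symbols), 2)]
```

Keying on an unordered pair reflects the fact that a rule between two agents does not depend on which of them is named first.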

3 Universal Interaction Systems 

Universality means that every interaction system can be simulated by a universal 
interaction system. Here, we use a very simple notion of simulation that is based 
on translation. 



3.1 Translation 

Let $\Sigma$ and $\Sigma'$ be two sets of symbols. A translation from $\Sigma$ to $\Sigma'$ is a map that
associates to each symbol in $\Sigma$ an interaction net of agents of $\Sigma'$ with the same
interface. This translation is naturally extended to interaction nets of agents of
$\Sigma$.
3.2 Simulation

We say that a translation $\Phi$ from $\Sigma$ to $\Sigma'$ defines a simulation of an interaction
system $\mathcal{I} = (\Sigma, \mathcal{R})$ by an interaction system $\mathcal{I}' = (\Sigma', \mathcal{R}')$ if the reduction
mechanisms on interaction nets of $\mathcal{I}$ and $\mathcal{I}'$ are compatible. More precisely, that
means that, if $N$ is an interaction net of $\mathcal{I}$ then:

1. $N$ is irreducible if and only if $\Phi(N)$ is irreducible;

2. if $N$ reduces to $N'$ then $\Phi(N)$ can be reduced to $\Phi(N')$.

This definition brings some properties with complete and deterministic in- 
teraction systems: 

— the interaction net corresponding to an agent must be reduced; 

— it has at most one agent whose principal port belongs to the interface and
the symbol of this interface is the same as the symbol of the principal port
of the initial agent;

— a translation is a simulation if and only if each rule of $\mathcal{R}$ is compatible with $\Phi$;

— the simulation relation is transitive and symmetric. 

In this paper, we have an approach that is not exactly the same as in [5].
Here, we work only with complete interaction systems but right members of
interaction rules need not be reduced. They just need to be erasable by $\varepsilon$ agents.
However, it has a very small influence on the properties studied here.



3.3 Universal Interaction System 



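An interaction system $\mathcal{U}$ is said to be universal if for any interaction system $\mathcal{I}$,
there exists a simulation of $\mathcal{I}$ by $\mathcal{U}$. In [5], Lafont introduces a system of
3 combinators $\gamma$, $\delta$ and $\varepsilon$ defined by 6 rules and he proves that this system is
universal:

[Figure: the 6 interaction rules of the combinators $\gamma$, $\delta$ and $\varepsilon$.]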
4 Universal System with Only 2 Agents

In this section, we show how to simulate a particular interaction system $\mathcal{I}$ with
an interaction system composed of only two symbols $\varepsilon$ and $\Pi_{\mathcal{I}}$. This system has
exactly 3 rules.

$\varepsilon$ agents have no auxiliary port. They erase everything, even another $\varepsilon$ agent
(see for instance the $\varepsilon$-rules above). The other symbol $\Pi_{\mathcal{I}}$ has more
auxiliary ports, depending on the complexity of the system being simulated.
The rule between $\varepsilon$ and itself gives the empty net. The rule between $\varepsilon$ and $\Pi_{\mathcal{I}}$
gives $m$ $\varepsilon$ agents where $m$ is the number of auxiliary ports of $\Pi_{\mathcal{I}}$. Finally, the rule
between $\Pi_{\mathcal{I}}$ and itself is complex and depends on $\mathcal{I}$. This rule must be symmetric
because it is a rule between an agent and itself.



4.1 Normalizing the Number of Auxiliary Ports in $\mathcal{I}$



Before giving directly the simulation of a system $\mathcal{I}$, we simulate it by a system
$\mathcal{I}'$ where all the agents have the same number of auxiliary ports except one $\varepsilon$
that has a null arity.

For $\mathcal{I} = (\Sigma, \mathcal{R})$, let $n \ge 0$ be the maximum arity of the symbols. If $n = 0$,
we set $n = 1$. We define $\Sigma' = \{(\alpha', n) \mid \alpha \in \Sigma\} \cup \{(\varepsilon, 0)\}$. $\varepsilon$ has a null arity
and erases everything that it meets. The other ones have the same arity $n$.

We translate an agent $\alpha$ of arity $i$ in $\mathcal{I}$ by an agent $\alpha'$ where the $n - i$ ports
number $i + 1, \ldots, n$ are connected to $n - i$ agents $\varepsilon$. The rule between agents
$\alpha'$ and $\beta'$ is derived from the rule between $\alpha$ and $\beta$ by substituting in the right
member each agent by its translation and by adding $\varepsilon$ agents to the symbols in
the interface that do not correspond to the interface of the rule between $\alpha$ and
$\beta$. If the arity of $\alpha$ is $i$ and the arity of $\beta$ is $j$, there are $(n - i) + (n - j)$ $\varepsilon$ agents
added to complete the interface of the rule between $\alpha'$ and $\beta'$.

A short proof shows that $\mathcal{I}$ is simulated by $\mathcal{I}'$.
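The arity normalization can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the dictionary representation of arities are assumptions made here, and the eraser symbol (which keeps arity 0) is deliberately left out of the input.

```python
def normalize_arities(arities):
    """Return the common arity n and, per symbol, the extra ports to plug with erasers.

    `arities` maps each non-eraser symbol to its original arity i;
    ports i+1..n of the translated agent are connected to eraser agents.
    """
    n = max(arities.values(), default=0)
    if n == 0:
        n = 1  # the text sets n = 1 when every symbol has arity 0
    padding = {sym: list(range(i + 1, n + 1)) for sym, i in arities.items()}
    return n, padding


def added_erasers(n, i, j):
    """Erasers added to the rule between agents of original arities i and j."""
    return (n - i) + (n - j)


# Hypothetical system with two symbols of arities 1 and 2.
n, padding = normalize_arities({"alpha": 1, "beta": 2})
```

Here `alpha` receives one padded port (port 2) and the rule between the two translated agents gains `added_erasers(2, 1, 2)` eraser agents on its interface.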



4.2 Simulation with $\varepsilon$ and $\Pi_{\mathcal{I}}$

We can assume that our interaction system $\mathcal{I}$ is composed of $k$ symbols all of
arity $n$ and $\varepsilon$ agents (which erase everything). This system has $k(k+1)/2$ proper
rules between the $k$ agents of arity $n$, $k$ rules between these agents and $\varepsilon$ which
are the same (except for the symbol of the agent) and create $n$ $\varepsilon$ agents, and a
rule for $\varepsilon$ and itself which gives the empty net.



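$\Pi_{\mathcal{I}}$ agents. The interaction system $\mathcal{I}$ has $k$ symbols whose arity is $n$ and the
$\varepsilon$ symbol. It is simulated by a system composed of two symbols: $\varepsilon$ and $\Pi_{\mathcal{I}}$. $\Pi_{\mathcal{I}}$
has exactly $n \times k \times (k + 2)$ auxiliary ports.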
[Figure: a $\Pi_{\mathcal{I}}$ agent with its auxiliary ports grouped as $X_1 \ldots X_k$, $Y_1 \ldots Y_k$ and $X_1/Y_1 \ldots X_k/Y_k$.]




The auxiliary ports are grouped together by $n$. In the picture, they
correspond to two vertical lines. The groups of $n$ auxiliary ports are put in three
different partitions. $X_1, \ldots, X_k$ are the inputs of the agent. One of these groups
corresponds to the $n$ auxiliary ports of the initial agent of $\mathcal{I}$. The other input
ports are connected to $\varepsilon$ agents. The second group of auxiliary ports $Y_1, \ldots, Y_k$
are the inputs of the other agent when this agent interacts with another $\Pi_{\mathcal{I}}$
agent. The agent connects all the auxiliary ports in this partition to the
auxiliary ports of the third group of ports. Finally, $X_1/Y_1, \ldots, X_k/Y_k$ are the interface
of the right members of the rules generated by the interaction of this agent and
another $\Pi_{\mathcal{I}}$ agent:

— $n \times k$ input ports $X_i^t$, $1 \le t \le n$ and $1 \le i \le k$;

— $n \times k$ intermediate ports $Y_j^t$, $1 \le t \le n$ and $1 \le j \le k$;

— $n \times k^2$ rule ports $(X_i/Y_j)^t$, $1 \le t \le n$ and $1 \le i, j \le k$.
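The three families of ports listed above add up to the stated total of $n \times k \times (k + 2)$ auxiliary ports, since $nk + nk + nk^2 = nk(k+2)$. The following Python sketch enumerates them explicitly; the tuple encoding of port names is an assumption chosen for illustration.

```python
def pi_ports(n, k):
    """Enumerate the auxiliary ports of the Pi agent for k symbols of arity n."""
    # Input ports X_i^t.
    inputs = [("X", i, t) for i in range(1, k + 1) for t in range(1, n + 1)]
    # Intermediate ports Y_j^t.
    intermediates = [("Y", j, t) for j in range(1, k + 1) for t in range(1, n + 1)]
    # Rule ports (X_i/Y_j)^t.
    rule_ports = [("X/Y", i, j, t)
                  for i in range(1, k + 1)
                  for j in range(1, k + 1)
                  for t in range(1, n + 1)]
    return inputs, intermediates, rule_ports


def total_ports(n, k):
    return sum(len(group) for group in pi_ports(n, k))
```

For example, `total_ports(2, 2)` counts 4 input ports, 4 intermediate ports and 8 rule ports.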

The $\varepsilon$ agent of $\mathcal{I}$ is directly translated into itself. An agent $\alpha$ of $\mathcal{I}$, different
from $\varepsilon$, is simulated by an agent $\Pi_{\mathcal{I}}$, $n \times k$ links between auxiliary ports of this
agent and $n \times (k^2 - 1)$ $\varepsilon$ agents. In fact, the symbols of $\mathcal{I}$ that are different
from $\varepsilon$ are numbered from 1 to $k$. Thus, if the number associated to $\alpha$ is $i$, its
translation is the net:



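[Figure: translation of the agent number $i$ as a $\Pi_{\mathcal{I}}$ agent whose input group $X_i$ is left free.]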




In this figure, the small circles are $\varepsilon$ agents. The $n$ input ports $X_i^1, \ldots, X_i^n$
correspond to the auxiliary ports of $\alpha$. The other input ports are connected to $\varepsilon$
agents. The $n \times k$ intermediate ports are connected to rule ports: for $1 \le t \le n$
and $1 \le j \le k$, $Y_j^t$ is connected to $(X_i/Y_j)^t$. The other rule ports are connected
to $\varepsilon$ agents.
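The wiring of this translation can be written out to check that every auxiliary port of the $\Pi_{\mathcal{I}}$ agent is accounted for. The sketch below is a hedged illustration: the function name and the tuple encoding of ports are assumptions, and only the connection pattern follows the description above.

```python
def translate_agent(i, n, k):
    """Wiring of the Pi agent that translates agent number i (1 <= i <= k).

    Returns the free input ports, the internal links and the ports plugged
    with eraser agents.
    """
    free, links, erased = [], [], []
    for t in range(1, n + 1):
        for g in range(1, k + 1):
            # Input group i carries the agent's own auxiliary ports;
            # the other input groups are erased.
            (free if g == i else erased).append(("X", g, t))
        for j in range(1, k + 1):
            # Intermediate port Y_j^t is linked to rule port (X_i/Y_j)^t.
            links.append((("Y", j, t), ("X/Y", i, j, t)))
        for a in range(1, k + 1):
            if a != i:
                # Rule ports (X_a/Y_j)^t with a != i are plugged with erasers.
                erased.extend(("X/Y", a, j, t) for j in range(1, k + 1))
    return free, links, erased


free, links, erased = translate_agent(i=1, n=2, k=2)
```

The eraser count is $n(k-1) + nk(k-1) = n(k^2 - 1)$, and each link uses two ports, so the free ports, linked ports and erased ports together cover all $nk(k+2)$ auxiliary ports.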

The rule between $\Pi_{\mathcal{I}}$ and itself. We need to define the interaction rule
between two agents $\Pi_{\mathcal{I}}$. To simplify this presentation, we use the symmetric
agent $\bar{\Pi}_{\mathcal{I}}$ for the bottom agent. The rule is as follows:
[Figure: the interaction rule between $\Pi_{\mathcal{I}}$ and itself. The top agent has port groups $X_1 \ldots X_k$, $Y_1 \ldots Y_k$, $X_1/Y_1 \ldots X_k/Y_k$; the bottom agent has the corresponding primed groups.]



In this rule, variables without ' are from the top $\Pi_{\mathcal{I}}$ agent and variables with '
are from the bottom $\Pi_{\mathcal{I}}$ agent. In the right member of the rule, input port $X_i^t$ of the top
agent is connected to the intermediate port $Y_i'^t$ of the bottom agent. Symmetrically,
$X_i'^t$ is connected to $Y_i^t$.

The groups of $n$ rule ports $(X_i/Y_i)^1, \ldots, (X_i/Y_i)^n$ and
$(X_i/Y_i)'^1, \ldots, (X_i/Y_i)'^n$ are connected to the translation $\Phi(N_{i,i})$ of the rule
between the agent number $i$ of $\mathcal{I}$ and itself. This interaction net $N_{i,i}$ is by
hypothesis symmetric, thus $\Phi(N_{i,i})$ is also symmetric and this part of the rule
between $\Pi_{\mathcal{I}}$ and itself is symmetric.

For $i \ne j$, the two groups $(X_i/Y_j)^1, \ldots, (X_i/Y_j)^n$ and
$(X_j/Y_i)'^1, \ldots, (X_j/Y_i)'^n$ are connected to the translation $\Phi(N_{i,j})$ of the
rule between the agent number $i$ of $\mathcal{I}$ and the agent number $j$. Because there is
an inversion of ports, the auxiliary ports corresponding to the side of the agent
number $i$ (the top agent on the figure) are connected to $(X_i/Y_j)^1, \ldots, (X_i/Y_j)^n$
and the auxiliary ports corresponding to the side of the agent number $j$ (the
bottom agent on the figure) are connected to $(X_j/Y_i)'^1, \ldots, (X_j/Y_i)'^n$. In
the same way, the two groups of $n$ rule ports $(X_j/Y_i)^1, \ldots, (X_j/Y_i)^n$ and
$(X_i/Y_j)'^1, \ldots, (X_i/Y_j)'^n$ are connected to the translation $\Phi(N_{i,j})$ of the rule
between the agent number $i$ of $\mathcal{I}$ and the agent number $j$ but in the opposite
direction (upside-down). The net formed by the 4 groups of rule ports
$(X_i/Y_j)^1, \ldots, (X_i/Y_j)^n$, $(X_j/Y_i)^1, \ldots, (X_j/Y_i)^n$,
$(X_i/Y_j)'^1, \ldots, (X_i/Y_j)'^n$ and $(X_j/Y_i)'^1, \ldots, (X_j/Y_i)'^n$
and the two translations $\Phi(N_{i,j})$ is symmetric.

Thus the rule is completely symmetric. On the figure, because we write the
rule between $\Pi_{\mathcal{I}}$ and its symmetric agent $\bar{\Pi}_{\mathcal{I}}$, the right member of the rule has to be horizontally
symmetric.

This rule and the erase rules with $\varepsilon$ define an interaction system $\Pi_{\mathcal{I}}$ with 2
agents and 3 rules.



$\Pi_{\mathcal{I}}$ simulates $\mathcal{I}$. This interaction system $\Pi_{\mathcal{I}}$ simulates the interaction system
$\mathcal{I}$. In fact, we just have to show that the translation of the interaction net
constituted by two agents of $\mathcal{I}$ connected by their principal ports reduces to the
translation of the right member of the rule between these two agents. This
is not difficult to check.

It is obvious for $\varepsilon$ rules. For an agent $\alpha$ of $\mathcal{I}$ and another one $\beta$, the translation
gives two agents $\Pi_{\mathcal{I}}$, several links and $\varepsilon$ agents. In a first step, the two agents
$\Pi_{\mathcal{I}}$ are reduced. This step replaces the two agents by a set of links and a set
of translations of right members of the interaction rules from $\mathcal{I}$. A second step
erases the right members of these rules that do not correspond to the interaction
between $\alpha$ and $\beta$.

Below is a simulation of the interaction of the translation of an agent number
$i$ and an agent number $j$, $i \ne j$:

[Figure: reduction of the translations of two interacting agents number $i$ and $j$.]
Universal system with 2 agents: first version. Starting with a universal
system $\mathcal{I}$, we obtain a universal system composed of only two agents $\Pi_{\mathcal{I}}$ and $\varepsilon$.
For instance, with Lafont’s combinators $\gamma$, $\delta$ and $\varepsilon$, we obtain a universal system
with $\varepsilon$ and an agent $\Pi_{\gamma,\delta}$ which has $2 \times 2 \times (2 + 2) = 16$ auxiliary ports.
5 Universal System with 2 Agents and a Minimum of
Ports

We have seen that the number of ports of the $\Pi_{\mathcal{I}}$ agent is generally very big. For
Lafont’s universal system, the agent $\Pi_{\gamma,\delta}$ has 16 auxiliary ports. We can reduce
this system to an interaction system with 2 agents, one with no auxiliary port 
and the other with only 3. 

This transformation is done in three steps. The first step adds a multiplexor
agent $\mu$. The second step reduces the number of auxiliary ports of $\Pi_{\mathcal{I}}$ using the
multiplexor. It leads to a new agent $\pi_{\mathcal{I}}$ with only one auxiliary port. Finally, a
last step merges together $\mu$ and $\pi_{\mathcal{I}}$ in a single agent with three auxiliary ports.



Adding $\mu$ agents. $\mu$ agents have 2 auxiliary ports. Their construction is the
same as the multiplexor introduced in [5]. In fact, we can use either Lafont’s
combinators $\delta$ or $\gamma$. The rule between $\mu$ and itself is as follows ($\delta$ version):

[Figure: the interaction rule between $\mu$ and itself.]




The multiplexor with 2 auxiliary ports can obviously be extended to
multiplexors with $n \times k \times (k + 2)$ auxiliary ports (the number of auxiliary ports of
$\Pi_{\mathcal{I}}$) using $n \times k \times (k + 2) - 1$ $\mu$ agents.
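The count of $\mu$ agents follows from the fact that each binary multiplexor merges two ports into one, so folding $m$ ports into a single port takes $m - 1$ agents. The Python sketch below builds one possible tree; the balanced layer-by-layer shape and the function name are assumptions made here, as the text only states the number of agents.

```python
def build_multiplexor(ports):
    """Fold a non-empty sequence of ports into one using binary mu agents.

    Returns the resulting tree and the number of mu agents used.
    """
    layer = list(ports)
    agents = 0
    while len(layer) > 1:
        next_layer = []
        # Pair up neighbouring ports; each pair costs one mu agent.
        for idx in range(0, len(layer) - 1, 2):
            next_layer.append(("mu", layer[idx], layer[idx + 1]))
            agents += 1
        # An odd port out is carried over to the next layer unchanged.
        if len(layer) % 2 == 1:
            next_layer.append(layer[-1])
        layer = next_layer
    return layer[0], agents


tree, count = build_multiplexor(range(16))
```

Any other binary tree shape would use the same number of agents; only the depth of the tree changes.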



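Reducing the ports of $\Pi_{\mathcal{I}}$ to 1 auxiliary port. Then, we can transform
$\Pi_{\mathcal{I}}$ into an agent $\pi_{\mathcal{I}}$ with only 1 auxiliary port followed by an $n \times k \times (k + 2)$
multiplexor:

[Figure: translation of $\Pi_{\mathcal{I}}$ as an agent $\pi_{\mathcal{I}}$ whose single auxiliary port feeds a multiplexor over the port groups $X_1 \ldots X_k$, $Y_1 \ldots Y_k$, $X_1/Y_1 \ldots X_k/Y_k$.]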
The rule between $\pi_{\mathcal{I}}$ and itself is deduced from the rule between $\Pi_{\mathcal{I}}$ and
itself by replacing in the right member of this rule each occurrence of $\Pi_{\mathcal{I}}$ by its
translation and by folding both groups of $n \times k \times (k + 2)$ auxiliary ports with an
$n \times k \times (k + 2)$ multiplexor.



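Merging $\mu$ and $\pi_{\mathcal{I}}$. The functionalities of $\mu$ and $\pi_{\mathcal{I}}$ do not overlap. In fact
we only use the rule between $\mu$ and itself and $\pi_{\mathcal{I}}$ and itself but never $\mu$ with
$\pi_{\mathcal{I}}$. As a consequence, we can put together these two agents into a single agent
$\mu \times \pi_{\mathcal{I}}$. The two translations from $\mu$ to $\mu \times \pi_{\mathcal{I}}$ and from $\pi_{\mathcal{I}}$ to $\mu \times \pi_{\mathcal{I}}$ are as
follows:

[Figure: translations of $\mu$ and of $\pi_{\mathcal{I}}$ into $\mu \times \pi_{\mathcal{I}}$.]

The rule between two $\mu \times \pi_{\mathcal{I}}$ is as follows:

[Figure: the interaction rule between $\mu \times \pi_{\mathcal{I}}$ and itself.]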




Universal system with 2 agents: second version. Starting with a universal
system $\mathcal{I}$, we obtain a universal system composed of only two agents $\mu \times \pi_{\mathcal{I}}$ and
$\varepsilon$. The first one has three auxiliary ports and the second one has no auxiliary
port.



6 Conclusion 

Three results are given in this paper. The first one gives a way to simulate every 
interaction system with a system composed of only two symbols. A corollary is 
that there exists a universal interaction system with only two symbols (in [5] the
universal system has 3 symbols). In the last part of this article, it is shown how
to reduce the number of auxiliary ports of one of the symbols to only 3. This
leads to a universal system with two symbols, one without auxiliary port and
the other with only 3.

We have succeeded in finding a simpler universal system than Lafont’s.
However, the price is that the right member of the rule between $\Pi_{\mathcal{I}}$ or $\mu \times \pi_{\mathcal{I}}$
and itself is big and not very pleasant (like Lafont’s system). One may wonder
if it is possible to find a system that has simpler right members for the rules.
Moreover, an open question remains: is it possible to find a universal system
with 2 agents, one with 0 auxiliary ports and the other with 2 (fewer is obviously
impossible)?
Abstract. Extensions of the simply typed lambda calculus have been
used as a metalanguage to represent “higher order term algebras”, such
as, for instance, formulas of the predicate calculus. In this representation
bound variables of the object language are represented by bound
variables of the metalanguage. This choice has various advantages but
makes the notion of “recursive definition” on higher order term algebras
more subtle than the corresponding notion on first order term algebras.
Despeyroux, Pfenning and Schürmann pointed out the problems that
arise in the proof of a canonical form theorem when one combines higher 
order representations with primitive recursion. 

In this paper we consider a stronger scheme of recursion and we prove 
that it captures all partial recursive functions on second order term alge- 
bras. We illustrate the system by considering typed programs to reduce 
to normal form terms of the untyped lambda calculus, encoded as ele- 
ments of a second order term algebra. First order encodings based on de 
Bruijn indexes are also considered. The examples also show that a version 
of the intersection type disciplines can be helpful in some cases to prove 
the existence of a canonical form. Finally we consider interpretations of 
our typed systems in the pure lambda calculus and a new gödelization
of the pure lambda calculus. 



1 Introduction 

Despite some limitations, first order term rewriting systems can serve as a frame- 
work for functional programming. In this approach one has a first order signature 
whose symbols are partitioned in two sets: constructor symbols and program 
symbols. A data structure is identified with the set of normal forms of a given 
type built up from a given set of constructor symbols. To each program symbol is 
associated a set of recursive equations which, when interpreted as rewrite rules, 
define the operational semantics of the program. 
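As a concrete instance of this setting, the sketch below uses constructor symbols `0` and `S` and a single program symbol `add`, whose two recursive equations are read as rewrite rules. The example and its encoding of terms as nested tuples are illustrative assumptions, not taken from the paper.

```python
ZERO = ("0",)


def S(t):
    return ("S", t)


def add(a, b):
    return ("add", a, b)


def step(t):
    """Apply one rewrite rule (outermost first); return None if t is a normal form."""
    if t[0] == "add":
        a, b = t[1], t[2]
        if a == ZERO:
            return b                    # add(0, y) -> y
        if a[0] == "S":
            return S(add(a[1], b))      # add(S(x), y) -> S(add(x, y))
    # Otherwise try to rewrite inside the arguments, left to right.
    for pos in range(1, len(t)):
        reduced = step(t[pos])
        if reduced is not None:
            return t[:pos] + (reduced,) + t[pos + 1:]
    return None


def normal_form(t):
    while True:
        reduced = step(t)
        if reduced is None:
            return t
        t = reduced


two, one = S(S(ZERO)), S(ZERO)
result = normal_form(add(two, one))
```

The normal forms built only from `0` and `S` play the role of the data structure of natural numbers, and `normal_form` is the operational semantics induced by the rules.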

* Dedicated to Nicolaas G. de Bruijn 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 15-EII 2001. 

© Springer-Verlag Berlin Heidelberg 2001






A. Berarducci and C. Bohm 



This approach however is not adequate to handle programs on syntactic data 
structures involving variable-binding operators. Consider for instance a program 
to find the prenex normal form of a formula of the predicate calculus, or a pro- 
gram to reduce lambda terms. In principle one can encode the relevant data 
structures as first order term algebras, and represent the corresponding pro- 
grams by first order term rewriting systems. It is however more convenient to 
assume that the programming environment (namely the “metatheory”) is based
on some version of the lambda calculus. In this approach bound variables of the 
object language are represented by bound variables of the metalanguage (rather 
than by strings of characters or de Bruijn indexes). This means that one can 
make use of the built-in procedures to handle renaming of bound variables and 
substitutions, which otherwise must be implemented separately in each case. 
Along these lines in shows that many important syntactic data structures in- 
volving variable-binding mechanisms can be represented in a version of the typed 
lambda calculus with additional constant symbols (higher order constructors). In
this representation scheme the elements of a given data structure correspond to 
the “canonical forms” of a certain type built up from a certain set of higher order 
constructor symbols. We call “higher order term algebra” the set of canonical 
forms representing the elements of a given data structure. 
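The idea of representing object-level binders by metalanguage binders can be illustrated in any language with first-class functions. In the hedged Python sketch below, which is not the typed calculus of the paper, an object-level abstraction stores a Python function, so substitution is simply function application in the host language; the class names and the weak head reduction function are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Var:
    name: str          # a free variable or parameter of the object language


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    body: Callable     # the binder is a host-language function


def whnf(t):
    """Reduce t to weak head normal form."""
    while isinstance(t, App):
        f = whnf(t.fun)
        if isinstance(f, Lam):
            # Beta step: substitution is performed by the host language.
            t = f.body(t.arg)
        else:
            return App(f, t.arg)
    return t


K = Lam(lambda x: Lam(lambda y: x))
result = whnf(App(App(K, Var("a")), Var("b")))
```

No renaming of bound variables and no substitution function had to be written, which is the convenience described in the paragraph above.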

In order to do some functional programming on such algebras we must introduce
a notion of “recursion”. A difficulty is that an element of a higher order
algebra is a normal form of atomic type, and yet it can have subterms of func- 
tional type; so it is not clear, at least semantically, in which sense these algebras 
can be considered to be inductively defined. The problem is considered in na 
IE] under different perspectives. In the presence of recursion the existence of 
canonical forms becomes a rather delicate issue. Essentially this depends on the 
fact that, a recursively defined function applied to a formal parameter (i.e. a 
bound variable), cannot reduce to a canonical form. In other words, to be able 
to evaluate a recursive function, or more generally a function defined by cases, it 
is necessary that its recursive argument is “constructed” rather than parametric. 
Thus, to prove that a given term has a canonical form, one must ensure that 
occurrences of recursively defined programs applied to formal parameters do not 
arise dynamically during the computation. The solution proposed in in the 
case of primitive recursion on higher order algebras, is a typed lambda calculus 
enriched with modalities whose purpose is to make a type distinction between 
parametric and constructed objects. This is reminiscent of the notion of “safe” 
recursion of or of the “tiers” of m- Following a different line of research. 
Constable El introduced an extension of the typed lambda calculus with gen- 
eral recursion, based on fixed points operators. Here one is only interested in 
first order data structures (booleans, integers), and the issues are different: the 
main type distinction to be made concerns partial versus total functions rather 
than parametric versus constructed objects.

Inspired by the work of these authors and continuing the work done in jOl 
E] in an untyped setting, we propose an extension of the simply typed lambda 
calculus with a scheme of recursion based on the distinction between program 
symbols and constructor symbols. We show that our scheme is able to capture
all partial recursive functions on first order or second order term algebras. 

Much of the emphasis of this paper is on the examples. We begin with a pro- 
gram to compute the prenex normal form of a formula of the predicate calculus 
(represented as in [14]). Then, as an example of a partial recursive function on
a higher order term algebra, we consider (typed) programs to reduce terms of 
the untyped lambda calculus to normal form (where inputs and outputs are en- 
coded as elements of a higher order term algebra). Similar programs have been 
discussed in ura in an untyped setting. 

In the case of partial functions the canonical form theorem takes the following
conditional form: if a computation terminates, then it terminates in a canonical
form. To prove that a given program has this property, a typing system based
on the simple types may not be sufficient and a more refined typing based on
a variant of the intersection type disciplines may be useful. We illustrate
this fact by proving the canonical form theorem for our program to reduce lambda
terms.

All the experiments have been computer-tested with software called the “CuCh
machine”. The CuCh machine is essentially an interpreter of the pure lambda
calculus together with a macro, developed by Böhm and Piperno, that transforms
recursive definitions into lambda terms. Quite remarkably, the interpretation
described in their papers does not make use of the fixed point operator.

2 Extensions of the Typed Lambda Calculus 

Given a set of atomic types we generate a set of types as follows. 

Definition 1. The types $\alpha$ are generated by the grammar

$\alpha ::= \text{(atomic type)} \mid \alpha_1 \times \dots \times \alpha_n \to \text{(atomic type)}$

Note that we allow cartesian products only on the left of the arrow. So
$\alpha_1 \times \alpha_2$ is not a type.

Definition 2. A signature $\Sigma$ is a set of “constant declarations” of the
form $c : \alpha$, where $\alpha$ is a type (over a given set of atomic types).
If $c : \alpha \in \Sigma$ we say that $c$ is a constant symbol of type $\alpha$.

We allow a constant symbol to have more than one type in a given signature,
even an infinite set of types. If the set is finite, this amounts to the possibility
of assigning to a constant symbol the intersection of all the types of the set, in
the sense of the intersection type disciplines.

Definition 3. Given a signature $\Sigma$, a basis $B$ is a set of “parameter
declarations” of the form $y : \alpha$, where $y$ is a variable and $\alpha$ is
a type. We stipulate that a basis cannot contain two different declarations
$y : \alpha$ and $y : \beta$ for the same variable. The following type
assignment system defines inductively the set of terms $t$ of type $\alpha$
over a basis $B$.

(Const.)  $\dfrac{c : \alpha \in \Sigma}{B \vdash c : \alpha}$

(Var.)  $\dfrac{x : \alpha \in B}{B \vdash x : \alpha}$

($\to$ I)  $\dfrac{B, x_1 : \alpha_1, \dots, x_n : \alpha_n \vdash t : \alpha}{B \vdash \lambda x_1 \dots x_n.t : \alpha_1 \times \dots \times \alpha_n \to \alpha}$

($\to$ E)  $\dfrac{B \vdash t : \alpha_1 \times \dots \times \alpha_n \to \alpha \quad B \vdash t_1 : \alpha_1 \ \cdots\ B \vdash t_n : \alpha_n}{B \vdash t\,t_1 \dots t_n : \alpha}$

(Weakening)  $\dfrac{B \vdash t : \alpha \quad B' \supseteq B}{B' \vdash t : \alpha}$

In rule ($\to$ I), $B$ does not contain declarations for $x_1, \dots, x_n$. If
we can deduce $B \vdash t : \alpha$ by the above rules we say that $t$ is a term
of type $\alpha$ over the basis $B$.

Note that $\lambda xy.t$ cannot be applied to a single argument: it necessarily
requires two arguments (so it is not identified with $\lambda x.(\lambda y.t)$,
which is not a legitimate term because we only allow atomic types on the right
hand side of an arrow). This feature of our system makes it possible to encode
bijectively the elements of various (higher order) data structures as the closed
normal forms of a given atomic type over a given signature (thus rendering
superfluous the distinction between normal form and canonical form).

The behaviour of the constant symbols of the signature is dictated by a set 
of reduction rules. 

Definition 4. Given a signature $\Sigma$, a reduction rule is a pair
$(t_1, t_2)$, written $t_1 := t_2$, such that for every basis $B$ and every type
$\alpha$, if $B \vdash t_1 : \alpha$, then $B \vdash t_2 : \alpha$.

Definition 5. A set $P$ of reduction rules over a signature $\Sigma$ determines
an extension $\Lambda(\Sigma, P)$ of the simply typed lambda calculus as
follows. The terms of this calculus are as defined in Definition 3. The
reduction relation $\to$ between typable terms of $\Lambda(\Sigma, P)$ is
obtained by adding to the $\beta$-rule
$(\lambda x_1 \dots x_n.t)\,t_1 \dots t_n \to t[t_1/x_1, \dots, t_n/x_n]$ all the
reductions $t_1 \to t_2$ for each reduction rule $t_1 := t_2$ of the program $P$.
By definition, the reduction relation is transitive and closed under
substitutions and contexts. We identify terms which differ only by a renaming of
bound variables.

The following “subject reduction” theorem holds.

Theorem 1. If $t_1 \to t_2$ in $\Lambda(\Sigma, P)$ and $B \vdash t_1 : \alpha$,
then $B \vdash t_2 : \alpha$.

3 General Recursive Programs on Second Order Term 
Algebras 

So far nothing forbids the presence of reduction rules for which the
Church-Rosser property for $\Lambda(\Sigma, P)$ fails. We will now introduce
further restrictions on the shape of the reduction rules which are satisfied in
all the examples, and which ensure that the Church-Rosser property holds and
that $\Lambda(\Sigma, P)$ is interpretable in the untyped lambda calculus. This
is done by distinguishing between constructor symbols and program symbols (or
destructors).

Definition 6. We say that a set of reduction rules $P$ over a signature
$\Sigma$ is dichotomic if the constant symbols of $\Sigma$ can be partitioned in
two sets, program symbols and constructor symbols, in such a way that each
reduction rule of $P$ has the form

$(\text{program})\,((\text{constructor})\,x_1 \dots x_n)\,y_1 \dots y_m := t$

We assume moreover that the rules of $P$ are mutually exclusive and exhaustive
in the sense that for each closed term $t$ whose outermost symbol is a program
symbol, and whose first argument begins with a constructor symbol, there is one
and only one rule $t_1 := t_2$ such that $t$ is a substitution instance of
$t_1$. We also require that each program symbol appears as the left-most symbol
of at least one equation of $P$. So if $P$ is empty all the symbols of the
signature are constructor symbols. Finally we will assume that the right-hand
sides of the reduction rules of $P$ have no $\beta$-redexes.

We now make some further assumptions concerning the complexity of the types 
of the signature. 

Definition 7. The level of a type is defined as the maximum number of nested 
occurrences of arrow symbols. 

So the level of an atomic type is zero and the level of a type of the form
$\alpha_1 \times \dots \times \alpha_n \to \beta$ is the maximum of the levels
of $\alpha_1, \dots, \alpha_n$ plus 1 (recall that $\beta$ is atomic, so it has
level zero).
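
The level computation of Definition 7 can be sketched in a few lines of Python. This is only an illustration: the representation of types (a string for an atomic type, an `Arrow` record for a product-to-atomic arrow) and the names `Arrow`, `Type` and `level` are assumptions made for this sketch and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Arrow:
    """A type alpha_1 x ... x alpha_n -> beta with beta atomic (hypothetical encoding)."""
    args: Tuple["Type", ...]   # the factors on the left of the arrow (non-empty)
    result: str                # the atomic type on the right of the arrow


Type = Union[str, Arrow]       # a plain string stands for an atomic type


def level(t: Type) -> int:
    """Maximum number of nested arrows, as in Definition 7."""
    if isinstance(t, str):
        return 0
    return 1 + max(level(a) for a in t.args)


EXP = "exp"
app_type = Arrow((EXP, EXP), EXP)                # exp x exp -> exp
abs_type = Arrow((Arrow((EXP,), EXP),), EXP)     # (exp -> exp) -> exp
```

With this encoding an atomic type has level 0, the type of a binary first order constructor has level 1, and the type of a binder-like constructor has level 2.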

Definition 8. Given a dichotomic set of rules $P$ over a signature $\Sigma$, we
say that $\Sigma$ and $P$ are second order, if the types of the program symbols
have level at most 1 and the types of all constructor symbols have level at
most 2.

All the examples we consider in this paper are based on dichotomic rules on a
second order signature. The second order assumption will be used in section 8
to prove the representability of every partial recursive function.

4 The Prenex Normal Form Example 

Our first example of a program on second order term algebras is a procedure 
Pnf to put a formula of the predicate calculus into prenex normal form. 

Definition 9. (Formulas of the predicate calculus) For simplicity we consider
formulas of the predicate calculus whose only non-logical symbol is a binary
relation symbol A. To represent such formulas as typed terms of the
extended lambda calculus we need an atomic type $\iota$ (individuals), an atomic
type $o$ (predicates), and the following constructor symbols:






$\mathrm{Fa} : (\iota \to o) \to o$
$\mathrm{Ex} : (\iota \to o) \to o$
$\mathrm{not} : o \to o$
$\mathrm{imp} : o \times o \to o$
$\mathrm{A} : \iota \times \iota \to o$

Universal quantification $\forall x P$ will be represented as
$\mathrm{Fa}(\lambda x.P)$ and existential quantification $\exists x P$ will be
represented as $\mathrm{Ex}(\lambda x.P)$. So a quantifier transforms a function
from individuals to predicates into a predicate. The constructors not and imp
stand for the negation and the implication sign.

For example, the formula $\forall x \forall y (\mathrm{A}xy \to \neg \mathrm{A}yx)$
is represented by the term
$\mathrm{Fa}(\lambda x.(\mathrm{Fa}(\lambda y.\mathrm{imp}(\mathrm{A}xy)(\mathrm{not}(\mathrm{A}yx)))))$
of type $o$. It is easy to see that the representation function maps bijectively
closed formulas of the predicate calculus into closed normal forms of type $o$.

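Definition 10. (Prenex normal form) The program Pnf defined below computes
the prenex normal form of a formula.

Program symbols:

$\mathrm{Pnf} : o \to o$
$\mathrm{nPnf} : o \to o$
$\mathrm{iPnf} : o \times o \to o$
$\mathrm{iP2} : o \times o \to o$

Reduction rules:

$\mathrm{Pnf}\,(\mathrm{Fa}\,t) := \mathrm{Fa}(\lambda z.(\mathrm{Pnf}\,(t\,z)))$  (1)
$\mathrm{Pnf}\,(\mathrm{Ex}\,t) := \mathrm{Ex}(\lambda z.(\mathrm{Pnf}\,(t\,z)))$  (2)
$\mathrm{Pnf}\,(\mathrm{not}\,L) := \mathrm{nPnf}\,(\mathrm{Pnf}\,L)$  (3)
$\mathrm{Pnf}\,(\mathrm{imp}\,L\,M) := \mathrm{iPnf}\,(\mathrm{Pnf}\,L)(\mathrm{Pnf}\,M)$  (4)
$\mathrm{Pnf}\,(\mathrm{A}\,u\,v) := \mathrm{A}\,u\,v$  (5)
$\mathrm{nPnf}\,(\mathrm{Fa}\,t) := \mathrm{Ex}(\lambda z.(\mathrm{Pnf}\,(\mathrm{not}\,(t\,z))))$  (6)
$\mathrm{nPnf}\,(\mathrm{Ex}\,t) := \mathrm{Fa}(\lambda z.(\mathrm{Pnf}\,(\mathrm{not}\,(t\,z))))$  (7)
$\mathrm{nPnf}\,(\mathrm{not}\,L) := L$  (8)
$\mathrm{nPnf}\,(\mathrm{imp}\,L\,M) := \mathrm{not}(\mathrm{imp}\,L\,M)$  (9)
$\mathrm{nPnf}\,(\mathrm{A}\,u\,v) := \mathrm{not}(\mathrm{A}\,u\,v)$  (10)
$\mathrm{iPnf}\,(\mathrm{Fa}\,t)\,y := \mathrm{Ex}(\lambda z.(\mathrm{Pnf}\,(\mathrm{imp}\,(t\,z)\,y)))$  (11)
$\mathrm{iPnf}\,(\mathrm{Ex}\,t)\,y := \mathrm{Fa}(\lambda z.(\mathrm{Pnf}\,(\mathrm{imp}\,(t\,z)\,y)))$  (12)
$\mathrm{iPnf}\,(\mathrm{not}\,L)\,y := \mathrm{iP2}\,y\,(\mathrm{not}\,L)$  (13)
$\mathrm{iPnf}\,(\mathrm{imp}\,L\,M)\,y := \mathrm{iP2}\,y\,(\mathrm{imp}\,L\,M)$  (14)
$\mathrm{iPnf}\,(\mathrm{A}\,u\,v)\,y := \mathrm{iP2}\,y\,(\mathrm{A}\,u\,v)$  (15)
$\mathrm{iP2}\,(\mathrm{Fa}\,t)\,x := \mathrm{Fa}(\lambda z.(\mathrm{Pnf}\,(\mathrm{imp}\,x\,(t\,z))))$  (16)
$\mathrm{iP2}\,(\mathrm{Ex}\,t)\,x := \mathrm{Ex}(\lambda z.(\mathrm{Pnf}\,(\mathrm{imp}\,x\,(t\,z))))$  (17)
$\mathrm{iP2}\,(\mathrm{not}\,L)\,x := \mathrm{imp}\,x\,(\mathrm{not}\,L)$  (18)
$\mathrm{iP2}\,(\mathrm{imp}\,L\,M)\,x := \mathrm{imp}\,x\,(\mathrm{imp}\,L\,M)$  (19)
$\mathrm{iP2}\,(\mathrm{A}\,u\,v)\,x := \mathrm{imp}\,x\,(\mathrm{A}\,u\,v)$  (20)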



The above set $P$ of equations defines an extension $\Lambda(\Sigma, P)$ of the
typed lambda calculus as in Definition 5 (where $\Sigma$ is the signature of $P$).

Remark 1. The role of the various equations should be clear. For instance
equation 11 takes care of the fact that the prenex normal form of a formula of
the form $(\forall x A) \to B$ is equal to the prenex normal form of
$\exists x (A \to B)$, provided $x$ does not occur free in $B$. Note that the
proviso is implicitly taken into account by the built-in rules of the
$\lambda$-calculus with renaming of bound variables to avoid unwanted capture of
variables.
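
The twenty rules of Definition 10 can be tried out with a small Python sketch in which a quantifier body of type $\iota \to o$ is modelled by a Python function, so that capture-avoiding substitution is inherited from the host language, much as the paper inherits it from the $\lambda$-calculus. The tuple encoding of formulas and the helper `show` (which invents the variable names `x0`, `x1`, ...) are assumptions of this sketch. Unlike the rewriting system, the sketch delays the work under a binder until the body is applied.

```python
# Formulas as tagged tuples (hypothetical encoding):
#   ("Fa", f)  ("Ex", f)  ("not", L)  ("imp", L, M)  ("A", u, v)
# where f is a Python function from an individual to a formula.

def pnf(p):
    tag = p[0]
    if tag == "Fa":
        return ("Fa", lambda z: pnf(p[1](z)))                  # rule (1)
    if tag == "Ex":
        return ("Ex", lambda z: pnf(p[1](z)))                  # rule (2)
    if tag == "not":
        return npnf(pnf(p[1]))                                 # rule (3)
    if tag == "imp":
        return ipnf(pnf(p[1]), pnf(p[2]))                      # rule (4)
    return p                                                   # rule (5)

def npnf(p):
    tag = p[0]
    if tag == "Fa":
        return ("Ex", lambda z: pnf(("not", p[1](z))))         # rule (6)
    if tag == "Ex":
        return ("Fa", lambda z: pnf(("not", p[1](z))))         # rule (7)
    if tag == "not":
        return p[1]                                            # rule (8)
    return ("not", p)                                          # rules (9), (10)

def ipnf(p, y):
    tag = p[0]
    if tag == "Fa":
        return ("Ex", lambda z: pnf(("imp", p[1](z), y)))      # rule (11)
    if tag == "Ex":
        return ("Fa", lambda z: pnf(("imp", p[1](z), y)))      # rule (12)
    return ip2(y, p)                                           # rules (13)-(15)

def ip2(p, x):
    tag = p[0]
    if tag == "Fa":
        return ("Fa", lambda z: pnf(("imp", x, p[1](z))))      # rule (16)
    if tag == "Ex":
        return ("Ex", lambda z: pnf(("imp", x, p[1](z))))      # rule (17)
    return ("imp", x, p)                                       # rules (18)-(20)

def show(p, depth=0):
    """Print a formula, naming the bound variables x0, x1, ... by nesting depth."""
    tag = p[0]
    if tag in ("Fa", "Ex"):
        v = f"x{depth}"
        quantifier = "forall" if tag == "Fa" else "exists"
        body = show(p[1](v), depth + 1)
        return f"{quantifier} {v}. {body}"
    if tag == "not":
        return "~" + show(p[1], depth)
    if tag == "imp":
        return f"({show(p[1], depth)} -> {show(p[2], depth)})"
    return f"A({p[1]},{p[2]})"

# (forall x. A(x,x)) -> (exists y. A(y,y))
example = ("imp", ("Fa", lambda x: ("A", x, x)), ("Ex", lambda y: ("A", y, y)))
```

For the formula `example`, rule (11) moves the universal quantifier of the antecedent out as an existential one, and rule (17) then moves the quantifier of the consequent out unchanged.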

5 A Partial Recursive Example 

The program for the prenex normal form is total recursive, namely it always
terminates. As an example of a partial recursive second order program we will
define a program Nf (typable within our system) to reduce closed terms of the
untyped lambda calculus to normal form. The program Nf is an improvement
of the one we presented earlier (in an untyped metatheory). The main novelty
is the use of the auxiliary data structure “list” which allows for a considerable
gain in efficiency. Our program should also be compared with an earlier one from
the literature, which is very elegant but conceptually rather complex, as it
requires, recursively, a duplication of terms into a “functional part” and an
“argument part”.

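Definition 11. To represent untyped lambda terms in our typed metatheory we
use the following signature $\Sigma_0$ of second order constructors.

$\mathrm{App} : \mathrm{exp} \times \mathrm{exp} \to \mathrm{exp}$;
$\mathrm{Abs} : (\mathrm{exp} \to \mathrm{exp}) \to \mathrm{exp}$

Definition 12. Given a term $t$ of the untyped lambda calculus, define
$\lceil t \rceil$ inductively as follows:

$\lceil x \rceil := x$;
$\lceil MN \rceil := \mathrm{App}\,\lceil M \rceil\,\lceil N \rceil$;
$\lceil \lambda x.M \rceil := \mathrm{Abs}(\lambda x.\lceil M \rceil)$

So for instance

$\lceil (\lambda x.xx)(\lambda x.xx) \rceil = \mathrm{App}(\mathrm{Abs}(\lambda x.\mathrm{App}\,x\,x))(\mathrm{Abs}(\lambda x.\mathrm{App}\,x\,x))$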



Remark 2. The above encoding is adequate: the map $t \mapsto \lceil t \rceil$ is
a bijection from terms of the untyped lambda calculus to normal forms of type
exp over $\Sigma_0$. The above representation is a variant of one from the
literature in which $\lceil x \rceil = \mathrm{Var}\,x$, with Var a new
constructor. We will use the following characterization of normal forms:

Remark 3.

$\mathrm{nf}(x A_1 \dots A_n) = x\,\mathrm{nf}(A_1) \dots \mathrm{nf}(A_n)$;  (1)
$\mathrm{nf}(\lambda x.B) = \lambda x.\mathrm{nf}(B)$;  (2)
$\mathrm{nf}((\lambda x.B) A_1 \dots A_n) = \mathrm{nf}(B[A_1/x] A_2 \dots A_n)$  (3)



We will define a program Nf such that
$\mathrm{Nf}\,\lceil t \rceil = \lceil \mathrm{nf}(t) \rceil$ for each closed
term $t$ having a normal form $\mathrm{nf}(t)$. Although we are not interested
in the behaviour of Nf on open terms, the above equations suggest that to carry
out the recursion we are forced to take into account not only closed terms, but
also open terms. However we cannot hope to have
$\mathrm{Nf}\,\lceil t \rceil = \lceil \mathrm{nf}(t) \rceil$ even for open
terms, because otherwise taking $n = 0$ in the first equation above we get
$\mathrm{Nf}\,\lceil x \rceil = \lceil x \rceil$, which implies
$\mathrm{Nf}\,x = x$ (as $\lceil x \rceil = x$), namely Nf is the identity
function. So we must decide what is the relevant equation for Nf on open terms.
The solution is
$\mathrm{Nf}\,\lceil t \rceil^{\mathrm{Box}} = \lceil \mathrm{nf}(t) \rceil$,
where $\lceil t \rceil^{\mathrm{Box}}$ is obtained from $\lceil t \rceil$ by
substituting each free variable $x$ by $\mathrm{Box}\,x$. Here Box is an
auxiliary constructor of type $\mathrm{exp} \to \mathrm{exp}$. So for instance
$\lceil \lambda y.xy \rceil^{\mathrm{Box}} = \mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{Box}\,x)\,y)$.
Note that for closed terms $\lceil t \rceil^{\mathrm{Box}} = \lceil t \rceil$.

We are finally ready to give the reduction rules defining Nf. The idea is
that Nf will introduce a Box each time an abstraction is passed over, and will
eliminate it when the corresponding variable is reached. The dots “. . .” in the
equations of Remark 3 suggest the use of the data structure “list”. In other
words it is convenient to generalize the program
$\mathrm{Nf} : \mathrm{exp} \to \mathrm{exp}$ to a program
$\mathrm{Reduce} : \mathrm{exp} \times \mathrm{list} \to \mathrm{exp}$ which
computes the normal form of a term applied to a list of terms.



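Definition 13. (A program to compute the normal form of a closed lambda
term) Besides the constructors App, Abs we use the following auxiliary
constructor symbols:

$\mathrm{Box} : \mathrm{exp} \to \mathrm{exp}$;
$\mathrm{nil} : \mathrm{list}$;
$\mathrm{cons} : \mathrm{exp} \times \mathrm{list} \to \mathrm{list}$

where list is an atomic type to represent lists of objects of type exp.

Program symbols:

$\mathrm{Reduce} : \mathrm{exp} \times \mathrm{list} \to \mathrm{exp}$;
$\mathrm{RBox} : \mathrm{list} \times \mathrm{exp} \to \mathrm{exp}$;
$\mathrm{RAbs} : \mathrm{list} \times (\mathrm{exp} \to \mathrm{exp}) \to \mathrm{exp}$

Reduction rules:

$\mathrm{Reduce}\,(\mathrm{App}\,x\,y)\,L := \mathrm{Reduce}\,x\,(\mathrm{cons}\,y\,L)$;  (1)
$\mathrm{Reduce}\,(\mathrm{Abs}\,f)\,L := \mathrm{RAbs}\,L\,f$;  (2)
$\mathrm{Reduce}\,(\mathrm{Box}\,u)\,L := \mathrm{RBox}\,L\,u$;  (3)
$\mathrm{RAbs}\,\mathrm{nil}\,f := \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,(f(\mathrm{Box}\,u))\,\mathrm{nil})$;  (4)
$\mathrm{RAbs}\,(\mathrm{cons}\,y\,L)\,f := \mathrm{Reduce}\,(f\,y)\,L$;  (5)
$\mathrm{RBox}\,\mathrm{nil}\,u := u$;  (6)
$\mathrm{RBox}\,(\mathrm{cons}\,y\,L)\,u := \mathrm{RBox}\,L\,(\mathrm{App}\,u\,(\mathrm{Reduce}\,y\,\mathrm{nil}))$  (7)

The above rules define a system $\Lambda(\Sigma, P)$, and within that system we
define $\mathrm{Nf} = \lambda x.\mathrm{Reduce}\,x\,\mathrm{nil}$.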



Note that an occurrence of the auxiliary constructor Box is introduced in step 4 
and is eliminated in step 3. 



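Example 1. (Example of the computation of a normal form)

Let us compute the normal form of the term $(\lambda xy.xyy)\lambda x.x$. We
will use the notation $t_1 \to_i t_2$ to express that the term $t_1$ has been
rewritten as $t_2$ applying the rule $i$ to a subterm of $t_1$; a step marked
$\to_\beta$ is a $\beta$-reduction.

$\mathrm{Reduce}\,(\mathrm{App}(\mathrm{Abs}(\lambda x.\mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{App}\,x\,y)\,y)))(\mathrm{Abs}(\lambda x.x)))\,\mathrm{nil}$
$\to_1\ \mathrm{Reduce}\,(\mathrm{Abs}(\lambda x.\mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{App}\,x\,y)\,y)))\,(\mathrm{cons}(\mathrm{Abs}(\lambda x.x))\,\mathrm{nil})$
$\to_2\ \mathrm{RAbs}\,(\mathrm{cons}(\mathrm{Abs}(\lambda x.x))\,\mathrm{nil})\,(\lambda x.\mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{App}\,x\,y)\,y))$
$\to_5\ \mathrm{Reduce}\,((\lambda x.\mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{App}\,x\,y)\,y))(\mathrm{Abs}(\lambda x.x)))\,\mathrm{nil}$
$\to_\beta\ \mathrm{Reduce}\,(\mathrm{Abs}(\lambda y.\mathrm{App}(\mathrm{App}(\mathrm{Abs}(\lambda x.x))\,y)\,y))\,\mathrm{nil}$
$\to_2\ \mathrm{RAbs}\,\mathrm{nil}\,(\lambda y.\mathrm{App}(\mathrm{App}(\mathrm{Abs}(\lambda x.x))\,y)\,y)$
$\to_4\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,((\lambda y.\mathrm{App}(\mathrm{App}(\mathrm{Abs}(\lambda x.x))\,y)\,y)(\mathrm{Box}\,u))\,\mathrm{nil})$
$\to_\beta\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,(\mathrm{App}(\mathrm{App}(\mathrm{Abs}(\lambda x.x))(\mathrm{Box}\,u))(\mathrm{Box}\,u))\,\mathrm{nil})$
$\to_1\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,(\mathrm{App}(\mathrm{Abs}(\lambda x.x))(\mathrm{Box}\,u))\,(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil}))$
$\to_1\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,(\mathrm{Abs}(\lambda x.x))\,(\mathrm{cons}(\mathrm{Box}\,u)(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil})))$
$\to_2\ \mathrm{Abs}(\lambda u.\mathrm{RAbs}\,(\mathrm{cons}(\mathrm{Box}\,u)(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil}))\,(\lambda x.x))$
$\to_5\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,((\lambda x.x)(\mathrm{Box}\,u))\,(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil}))$
$\to_\beta\ \mathrm{Abs}(\lambda u.\mathrm{Reduce}\,(\mathrm{Box}\,u)\,(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil}))$
$\to_3\ \mathrm{Abs}(\lambda u.\mathrm{RBox}\,(\mathrm{cons}(\mathrm{Box}\,u)\,\mathrm{nil})\,u)$
$\to_7\ \mathrm{Abs}(\lambda u.\mathrm{RBox}\,\mathrm{nil}\,(\mathrm{App}\,u\,(\mathrm{Reduce}\,(\mathrm{Box}\,u)\,\mathrm{nil})))$
$\to_6\ \mathrm{Abs}(\lambda u.\mathrm{App}\,u\,(\mathrm{Reduce}\,(\mathrm{Box}\,u)\,\mathrm{nil}))$
$\to_3\ \mathrm{Abs}(\lambda u.\mathrm{App}\,u\,(\mathrm{RBox}\,\mathrm{nil}\,u))$
$\to_6\ \mathrm{Abs}(\lambda u.\mathrm{App}\,u\,u)$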
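
The seven rules of Definition 13 translate almost literally into Python if the argument of Abs is modelled by a Python function. The following is a minimal sketch under that assumption. The tuple encoding, the names `reduce_`, `rabs`, `rbox`, `nf` and `show`, and the tag `"Var"` (used only by `show` as a printable placeholder for a bound variable) are illustrative choices, not part of the paper's signature. Like Nf, the sketch does not terminate on terms without a normal form.

```python
NIL = ("nil",)

def reduce_(t, L):
    tag = t[0]
    if tag == "App":
        return reduce_(t[1], ("cons", t[2], L))                      # rule (1)
    if tag == "Abs":
        return rabs(L, t[1])                                         # rule (2)
    if tag == "Box":
        return rbox(L, t[1])                                         # rule (3)
    raise ValueError(f"unexpected term: {tag}")

def rabs(L, f):
    if L[0] == "nil":
        return ("Abs", lambda u: reduce_(f(("Box", u)), NIL))        # rule (4)
    _, y, rest = L
    return reduce_(f(y), rest)                                       # rule (5)

def rbox(L, u):
    if L[0] == "nil":
        return u                                                     # rule (6)
    _, y, rest = L
    return rbox(rest, ("App", u, reduce_(y, NIL)))                   # rule (7)

def nf(t):
    """Nf = lambda x. Reduce x nil."""
    return reduce_(t, NIL)

def show(t, depth=0):
    """Print a Box-free term, naming bound variables x0, x1, ... by depth."""
    tag = t[0]
    if tag == "Var":
        return t[1]
    if tag == "App":
        return f"App({show(t[1], depth)}, {show(t[2], depth)})"
    if tag == "Abs":
        v = f"x{depth}"
        body = show(t[1](("Var", v)), depth + 1)
        return f"Abs(lam {v}. {body})"
    raise ValueError(f"unexpected term: {tag}")

# The term of Example 1: (lambda x y. x y y)(lambda x. x)
identity = ("Abs", lambda x: x)
term = ("App",
        ("Abs", lambda x: ("Abs", lambda y: ("App", ("App", x, y), y))),
        identity)
```

Printing `nf(term)` with `show` yields the encoding of $\lambda u.uu$, in agreement with the last line of Example 1.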



A formal proof of correctness of the program Nf is long and tedious: it is based
on the observation that our reduction rules simulate the equations in Remark 3.

Remark 4. The program in Definition 13, unlike the earlier program it is
compared with above, has the property that $\mathrm{Nf}\,\lceil t \rceil$ is
strongly normalizing (every reduction path terminates) whenever $t$ is strongly
normalizing.

6 Canonical Forms



One of the main difficulties which must be overcome when programming on 
higher order term algebras is that the typing information may not be sufficient 
to ensure that a given closed normal form is canonical, namely it represents an 
element of a given data structure. 

Example 2. Consider the system $\Lambda(\Sigma, P)$ of Definition 13. By
Remark 2 the closed normal forms of type exp over the signature
$\{\mathrm{App}, \mathrm{Abs}\} \subset \Sigma$ represent the untyped lambda
terms. Unfortunately within the system $\Lambda(\Sigma, P)$ there are “exotic”
closed normal forms of type exp (over $\Sigma$) which do not represent any
lambda term. Examples of such terms are $\mathrm{Abs}(\lambda x.\mathrm{Nf}\,x)$
and $\mathrm{Box}(\lambda x.x)$. The first term is particularly bad since it is
a normal form containing an occurrence of a program symbol: this difficulty
arises with second order programming and has no analogue in the first order case
(i.e. when all constructors have level at most 1).

The presence of exotic terms shows that the typing information may be too weak 
to prove that a program has the correct range. To solve the problem we use a 
feature of our system that we have not yet exploited: the fact that we allow 
constructor symbols to have more than one type. The following result illustrates 
the technique. 

Theorem 2. (Canonical form theorem for Nf) Given a closed term $t$ of the
untyped lambda calculus, if $\mathrm{Nf}\,\lceil t \rceil$ has a normal form in
the system $\Lambda(\Sigma, P)$ of Definition 13, then its normal form has the
shape $\lceil t' \rceil$ for some $t'$.



Proof. By the adequacy of the representation (Remark 2), it suffices to show
that the normal form of $\mathrm{Nf}\,\lceil t \rceil$ is a term over the
signature $\{\mathrm{App}, \mathrm{Abs}\} \subset \Sigma$ (necessarily of type
exp by the subject reduction theorem). To this aim we redefine the signature
used in Definition 13 as follows, using two new atomic types
$\Box\mathrm{exp}$ and $\Box\mathrm{list}$ (warning: we are not taking $\Box$ as
a new type constructor):

$\mathrm{App} : \mathrm{exp} \times \mathrm{exp} \to \mathrm{exp},\ \Box\mathrm{exp} \times \Box\mathrm{exp} \to \Box\mathrm{exp}$;
$\mathrm{Abs} : (\mathrm{exp} \to \mathrm{exp}) \to \mathrm{exp},\ (\Box\mathrm{exp} \to \Box\mathrm{exp}) \to \Box\mathrm{exp}$;
$\mathrm{Box} : \mathrm{exp} \to \Box\mathrm{exp}$;
$\mathrm{nil} : \Box\mathrm{list}$;
$\mathrm{cons} : \Box\mathrm{exp} \times \Box\mathrm{list} \to \Box\mathrm{list}$;
$\mathrm{Reduce} : \Box\mathrm{exp} \times \Box\mathrm{list} \to \mathrm{exp}$;
$\mathrm{RBox} : \Box\mathrm{list} \times \mathrm{exp} \to \mathrm{exp}$;
$\mathrm{RAbs} : \Box\mathrm{list} \times (\Box\mathrm{exp} \to \Box\mathrm{exp}) \to \mathrm{exp}$

So App and Abs have two types each. The reduction rules of Definition 13 are
still correctly typed in this new signature and Nf has type
$\Box\mathrm{exp} \to \mathrm{exp}$. The idea is that $\Box\mathrm{exp}$
represents the objects of the form $\lceil t \rceil^{\mathrm{Box}}$ and exp
represents the objects of the form $\lceil t \rceil$, as in the discussion
following Remark 3. Over the new signature the representation $\lceil t \rceil$
of a closed term $t$ has both type exp and $\Box\mathrm{exp}$, so
$\mathrm{Nf}\,\lceil t \rceil$ is correctly typed and has type exp. By the
subject reduction theorem if the computation of $\mathrm{Nf}\,\lceil t \rceil$
terminates, the result has type exp. An easy induction shows that a closed
normal form of type exp is necessarily a term over the signature
$\{\mathrm{App}, \mathrm{Abs}\} \subset \Sigma$, hence it is a term of the form
$\lceil t' \rceil$.

Remark 5. The signature $\Sigma$ of Definition 13 is obtained from the
signature $\Sigma'$ in the proof of Theorem 2 by identifying $\Box\mathrm{exp}$
with exp and $\Box\mathrm{list}$ with list. This suggests the following
definition.

Definition 14. We say that a signature $\Sigma'$ refines a signature $\Sigma$,
if $\Sigma$ and $\Sigma'$ have the same number of constant symbols, with the
same names but possibly different types, and $\Sigma$ can be obtained from
$\Sigma'$ by substituting, in the constant declarations, some atomic types with
other types (so in particular by identifying some atomic types).

Theorem 2 shows that to prove that a program of a given system
$\Lambda(\Sigma, P)$ has the correct range (or more generally has some desired
property), it may be convenient to try to refine the signature, in such a way
that the new signature still respects $P$. Note that when we refine a signature
we make fewer terms typable: in fact, if $\Sigma'$ refines $\Sigma$, and a
closed term $t$ has type $\alpha$ over $\Sigma'$, then $t$ has type $\alpha'$
over $\Sigma$, where $\alpha'$ is a substitution instance of $\alpha$ (with the
same substitution as in the definition of refinement). The notion of refinement
is clearly related to the notion of principal type scheme in the intersection
type disciplines.

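Example 3. Another interesting refinement of the signature of Definition 13
which respects the corresponding reduction rules, uses infinitely many types for
each constant symbol parameterized by an index $i$ ranging over the integers
$\mathbb{Z}$:

$\mathrm{Abs} : (\mathrm{exp}_{i+1} \to \mathrm{exp}_{i+1}) \to \mathrm{exp}_{i+1}$;
$\mathrm{App} : \mathrm{exp}_{i+1} \times \mathrm{exp}_{i+1} \to \mathrm{exp}_{i+1}$;
$\mathrm{Box} : \mathrm{exp}_i \to \mathrm{exp}_{i+1}$;
$\mathrm{nil} : \mathrm{list}_i$;
$\mathrm{cons} : \mathrm{exp}_{i+1} \times \mathrm{list}_{i+1} \to \mathrm{list}_{i+1}$;
$\mathrm{Reduce} : \mathrm{exp}_{i+1} \times \mathrm{list}_{i+1} \to \mathrm{exp}_i$;
$\mathrm{RBox} : \mathrm{list}_{i+1} \times \mathrm{exp}_i \to \mathrm{exp}_i$;
$\mathrm{RAbs} : \mathrm{list}_{i+1} \times (\mathrm{exp}_{i+1} \to \mathrm{exp}_{i+1}) \to \mathrm{exp}_i$

Now consider the terms $\lambda x.\mathrm{Nf}(\mathrm{Nf}\,x)$ and
$\mathrm{Abs}(\lambda x.\mathrm{Nf}\,x)$. Both terms can be typed in the
original signature of Definition 13 in which Nf has type
$\mathrm{exp} \to \mathrm{exp}$. Neither of them can be typed in the refined
signature in the proof of Theorem 2 in which Nf has type
$\Box\mathrm{exp} \to \mathrm{exp}$. Only the first one can be typed in the
refined signature of Example 3. Collecting information coming from different
signatures we gain insight into the program Nf.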



7 From Second Order to First Order Representations

Using de Bruijn indexes we can pass from a second order to a first order
representation of lambda terms (or formulas, or any other second order syntactic
structure). Once a second order program is found, it is easy to transform it into a
first order program by implementing the relevant procedures to handle variable
substitutions. To illustrate this idea we modify the program of Definition 13 to
obtain a program to reduce a lambda term to normal form in de Bruijn notation.

Definition 15. (De Bruijn notation) We recall the de Bruijn notation for
lambda-terms. In this notation variable occurrences are replaced by positive
integers. For example $\lambda x.\lambda y.xy(\lambda z.x)$ becomes
$\lambda\lambda 2\,1(\lambda 3)$. The positive integer $n$ indicates a variable
which is bound by the $n$-th occurrence of $\lambda$ going upward in the
parsing tree of the term. If such an occurrence of $\lambda$ does not exist,
then the integer indicates a free variable.

Note that closed terms which differ only by a renaming of bound variables have
the same de Bruijn notation.
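
As an illustration of Definition 15, here is a short Python sketch that converts a closed term with named variables into de Bruijn notation with 1-based indices. The tuple encoding (`"var"`, `"lam"`, `"app"` for named terms, `"var"`, `"abs"`, `"app"` for de Bruijn terms) and the function name `to_de_bruijn` are assumptions of the sketch; free variables are simply rejected.

```python
def to_de_bruijn(t, env=()):
    """Convert a closed named term to de Bruijn form.

    `env` lists the enclosing binders, innermost first, so the index of a
    variable is its position in `env` plus one.
    """
    tag = t[0]
    if tag == "var":
        # tuple.index raises ValueError for a free variable
        return ("var", env.index(t[1]) + 1)
    if tag == "lam":
        return ("abs", to_de_bruijn(t[2], (t[1],) + env))
    return ("app", to_de_bruijn(t[1], env), to_de_bruijn(t[2], env))

# lambda x. lambda y. x y (lambda z. x)
named = ("lam", "x",
         ("lam", "y",
          ("app",
           ("app", ("var", "x"), ("var", "y")),
           ("lam", "z", ("var", "x")))))
```

On `named` the conversion produces the structure corresponding to $\lambda\lambda 2\,1(\lambda 3)$, and two terms that differ only in the names of bound variables are mapped to the same result.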

Definition 16. To represent lambda terms in de Bruijn notation we use the
following signature:

Constructor symbols:

$1 : \mathrm{nat}$;
$S : \mathrm{nat} \to \mathrm{nat}$;
$\mathrm{var} : \mathrm{nat} \to \mathrm{term}$;
$\mathrm{abs} : \mathrm{term} \to \mathrm{term}$;
$\mathrm{app} : \mathrm{term} \times \mathrm{term} \to \mathrm{term}$

For example $\lambda\lambda 2\,1(\lambda 3)$ becomes
$\mathrm{abs}(\mathrm{abs}(\mathrm{app}(\mathrm{app}\,v_2\,v_1)(\mathrm{abs}\,v_3)))$,
where $2 = S(1)$, $3 = S(2)$, $v_n = (\mathrm{var}\,n)$. The de Bruijn terms
correspond bijectively to the closed normal forms of type term.

The program $\mathrm{nf} : \mathrm{term} \to \mathrm{term}$ defined below
reduces a de Bruijn term to normal form. Unlike the higher order program Nf of
Definition 13 it works well even when applied to representations of open terms
(the free variables are represented by de Bruijn indexes which point “above” the
root of the term). The reduction rules in the definition of nf are almost
identical to those of Nf. The main difference is in equation (5) below, where
the auxiliary program sub is used to simulate the single $\beta$-reduction which
in the higher order program is built-in.

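Definition 17. (A program to reduce de Bruijn terms to normal form) We set
$\mathrm{nf} = \lambda x.\mathrm{reduce}\,x\,\mathrm{nil}$ where reduce is
defined as follows:

Auxiliary constructor symbols:

$\mathrm{nil} : \mathrm{list}$;
$\mathrm{cons} : \mathrm{term} \times \mathrm{list} \to \mathrm{list}$

Program symbols:

$\mathrm{reduce} : \mathrm{term} \times \mathrm{list} \to \mathrm{term}$;
$\mathrm{Rabs} : \mathrm{list} \times \mathrm{term} \to \mathrm{term}$;
$\mathrm{Rvar} : \mathrm{list} \times \mathrm{term} \to \mathrm{term}$;
$\mathrm{sub} : \mathrm{term} \times \mathrm{term} \to \mathrm{term}$;
$\mathrm{subs} : \mathrm{term} \times \mathrm{term} \times \mathrm{nat} \to \mathrm{term}$;
$\mathrm{update} : \mathrm{term} \times \mathrm{nat} \times \mathrm{nat} \to \mathrm{term}$

Reduction rules:

$\mathrm{reduce}\,(\mathrm{app}\,x\,y)\,L := \mathrm{reduce}\,x\,(\mathrm{cons}\,y\,L)$;  (1)
$\mathrm{reduce}\,(\mathrm{abs}\,f)\,L := \mathrm{Rabs}\,L\,f$;  (2)
$\mathrm{reduce}\,(\mathrm{var}\,n)\,L := \mathrm{Rvar}\,L\,(\mathrm{var}\,n)$;  (3)
$\mathrm{Rabs}\,\mathrm{nil}\,f := \mathrm{abs}(\mathrm{reduce}\,f\,\mathrm{nil})$;  (4)
$\mathrm{Rabs}\,(\mathrm{cons}\,y\,L)\,f := \mathrm{reduce}\,(\mathrm{sub}\,f\,y)\,L$;  (5)
$\mathrm{Rvar}\,\mathrm{nil}\,u := u$;  (6)
$\mathrm{Rvar}\,(\mathrm{cons}\,y\,L)\,u := \mathrm{Rvar}\,L\,(\mathrm{app}\,u\,(\mathrm{reduce}\,y\,\mathrm{nil}))$  (7)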
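
Rules (1) to (7) can be exercised with the following Python sketch. The rules for reduce, Rabs and Rvar follow the definition above. The substitution `subs` and the index adjustment `shift` are the standard textbook operations on 1-based de Bruijn indices, supplied here as an assumption so that the sketch is runnable; they are not the paper's own equations for subs and update. Python lists stand in for the constructors nil and cons.

```python
def shift(t, d, cutoff=0):
    """Add d to every index that points above `cutoff` enclosing binders."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1] + d) if t[1] > cutoff else t
    if tag == "abs":
        return ("abs", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subs(t, x, m):
    """Replace index m + 1 in t by x, where m binders have been crossed."""
    tag = t[0]
    if tag == "var":
        n = t[1]
        if n == m + 1:
            return shift(x, m)          # x moves under m binders
        if n > m + 1:
            return ("var", n - 1)       # one binder has disappeared
        return t
    if tag == "abs":
        return ("abs", subs(t[1], x, m + 1))
    return ("app", subs(t[1], x, m), subs(t[2], x, m))

def reduce_(t, L):
    tag = t[0]
    if tag == "app":
        return reduce_(t[1], [t[2]] + L)                  # rule (1)
    if tag == "abs":
        return rabs(L, t[1])                              # rule (2)
    return rvar(L, t)                                     # rule (3)

def rabs(L, f):
    if not L:
        return ("abs", reduce_(f, []))                    # rule (4)
    return reduce_(subs(f, L[0], 0), L[1:])               # rule (5), sub f y = subs f y 0

def rvar(L, u):
    if not L:
        return u                                          # rule (6)
    return rvar(L[1:], ("app", u, reduce_(L[0], [])))     # rule (7)

def nf(t):
    return reduce_(t, [])

def v(n):
    return ("var", n)

# (lambda lambda 2 1 1)(lambda 1), i.e. (lambda x y. x y y)(lambda x. x)
term = ("app", ("abs", ("abs", ("app", ("app", v(2), v(1)), v(1)))), ("abs", v(1)))
```

On `term` the sketch returns the de Bruijn form of $\lambda u.uu$, matching the result of the higher order program on the same input.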



The purpose of $\mathrm{sub}\,f\,y$ in equation (5) is to find the
$\beta$-reduct of $\mathrm{abs}\,f$ applied to $y$. We set
$\mathrm{sub} = \lambda f y.\mathrm{subs}\,f\,y\,0$ where subs is defined as
follows:

$\mathrm{subs}\,(\mathrm{app}\,u\,v)\,x\,m := \mathrm{app}(\mathrm{subs}\,u\,x\,m)(\mathrm{subs}\,v\,x\,m)$;  (8)
$\mathrm{subs}\,(\mathrm{abs}\,u)\,x\,m := \mathrm{abs}(\mathrm{subs}\,u\,x\,(m + 1))$;  (9)

The program update takes care of updating the de Bruijn indexes of the free
variables after the substitution performed by subs.

$\mathrm{update}\,(\mathrm{app}\,x\,y)\,m\,j := \mathrm{app}(\mathrm{update}\,x\,m\,j)(\mathrm{update}\,y\,m\,j)$;  (11)
$\mathrm{update}\,(\mathrm{abs}\,x)\,m\,j := \mathrm{abs}(\mathrm{update}\,x\,m\,(j + 1))$;  (12)

We do not enter into the details of the equations for the updating of the de
Bruijn indexes since similar equations have already been used by various
authors, who consider a “lambda calculus with explicit substitutions” using a
suitable notation based on de Bruijn indexes. What we do is different: we are
not defining a lambda calculus, but rather a program to reduce lambda terms. So
in our approach, along with the normalization program, we can also define a
wealth of other programs on lambda terms, for instance a program to count the
variables. The lambda terms, represented in de Bruijn notation, do not reduce by
themselves to normal form: they must be given as inputs to the normalization
program.

8 Computability of All Partial Recursive Functions

We define a second order term algebra as the set of closed normal forms of
a given atomic type $\alpha$ over a given signature $\Sigma$ of second order
constructors. So such an algebra can be denoted by the pair $(\alpha, \Sigma)$.
We have seen that many interesting data structures can be represented by second
order term algebras.

Theorem 3. For every partial recursive function $f$ between two second order
term algebras $(\alpha_1, \Sigma_1)$ and $(\alpha_2, \Sigma_2)$, there is a
dichotomic set of reduction rules over a second order signature
$\Sigma \supseteq \Sigma_1 \cup \Sigma_2$ which computes $f$. A similar result
holds for functions of several arguments.

Proof. (Sketch) We take for granted that the result is true for first order
algebras (“folklore theorem”), namely when the constant symbols of the
signatures have level at most 1. In the general case the idea is to show that to
every second order term algebra we can associate a first order term algebra and
a bijection between the two algebras, such that the bijection and its inverse
are computable by a dichotomic set of reduction rules over a second order
signature. For instance the second order term algebra of type exp that we have
used to represent untyped lambda terms (Definition 11) can be associated
bijectively to the first order term algebra of type term which represents lambda
terms in de Bruijn notation (Definition 16). For the details of how to translate
between the two representations using a dichotomic set of reduction rules see
the appendix.

9 Interpretation in the Pure Untyped Lambda Calculus 

Definition 18. An interpretation of $\Lambda(\Sigma, P)$ into the untyped
lambda calculus is a map $\phi$ which assigns to every closed term $t$ of
$\Lambda(\Sigma, P)$ a closed term $t^\phi$ of the untyped lambda calculus and
has the following properties:

1. $(t\,t_1 \dots t_n)^\phi = t^\phi t_1^\phi \dots t_n^\phi$,
$(\lambda x_1 \dots x_n.t)^\phi = \lambda x_1 \dots x_n.(t^\phi)$. So $\phi$ is
uniquely determined by its restriction to the symbols of the signature.

2. If $t_1 \to t_2$ in $\Lambda(\Sigma, P)$, then $t_1^\phi \to t_2^\phi$ in the
untyped lambda calculus.

The above definition admits many variants. A weaker notion is obtained 
by replacing the reduction relation by the convertibility relation in clause 2. A 
stronger version is obtained by requiring that the converse implication of clause 
2 also holds. A reasonable compromise is to require that the interpretation is 
injective on the data structures. This can be formalized as follows: 

Definition 19. An interpretation $\phi$ of $\Lambda(\Sigma, P)$ into the
untyped lambda calculus is injective on the data structures if whenever $t_1$
and $t_2$ are distinct normal forms of $\Lambda(\Sigma, P)$ having atomic type
and not containing program symbols, then $t_1^\phi$ and $t_2^\phi$ have distinct
normal forms.

Using a technique introduced by Böhm and Piperno one can prove the following
theorem:

Theorem 4. $\Lambda(\Sigma, P)$ can be interpreted in the untyped lambda
calculus by an interpretation that is injective on the data structures.

Quite remarkably the interpretation of Böhm and Piperno does not make use of
the fixed point combinator. Using this fact it has been shown that in the first
order case the interpretation “preserves strong normalization”.

A different interpretation of a higher order term algebra into the pure
lambda calculus is obtained by replacing the constructors by variables and
abstracting them, as in the definition of the Church numerals (similar
proposals exist in the literature).

Consider for instance the higher order algebra of type exp in Definition 11.
The term
$\lceil (\lambda x.xx)(\lambda x.xx) \rceil = \mathrm{App}(\mathrm{Abs}(\lambda x.\mathrm{App}\,x\,x))(\mathrm{Abs}(\lambda x.\mathrm{App}\,x\,x))$
is an element of that algebra. By replacing the constructors App, Abs by
variables $a$, $b$ and abstracting them, we obtain the lambda term
$\lambda ab.a(b(\lambda x.a\,x\,x))(b(\lambda x.a\,x\,x))$ (which is a normal
form).

In this way we have defined an embedding $t \mapsto \theta(t)$ of the pure
lambda calculus into itself (e.g. $\theta$ sends $(\lambda x.xx)(\lambda x.xx)$
into $\lambda ab.a(b(\lambda x.a\,x\,x))(b(\lambda x.a\,x\,x))$) which is
probably the simplest “gödelization” of the lambda calculus which has ever been
considered. The name “gödelization” is justified by the fact that there is a
combinator which defines a bijection from the image of $\theta$ onto the Church
numerals. This can be easily proved by applying the interpretation of Böhm and
Piperno to the translations between first order and higher order term algebras
given in the appendix. We also need the fact that all infinite first order
algebras (suitably embedded in the lambda calculus) admit a lambda definable
bijection onto the Church numerals. Note that the image of $\theta$ consists of
typable terms of type
$(\mathrm{exp} \times \mathrm{exp} \to \mathrm{exp}) \to ((\mathrm{exp} \to \mathrm{exp}) \to \mathrm{exp}) \to \mathrm{exp}$.
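
The embedding $\theta$ can be read as a fold: $\theta(t)$ takes two arguments $a$ and $b$ and uses them wherever the encoding of $t$ uses App and Abs. The Python sketch below makes this reading executable for terms with named variables. The tuple encoding and the names `theta`, `size_app` and `size_abs` are hypothetical, chosen only for this illustration.

```python
def theta(t):
    """Return the Python counterpart of lambda a b. <t with App, Abs replaced by a, b>."""
    def fold(t, a, b, env):
        tag = t[0]
        if tag == "var":
            return env[t[1]]                      # value bound to the variable
        if tag == "app":
            return a(fold(t[1], a, b, env), fold(t[2], a, b, env))
        # tag == "lam": the body becomes a function of the bound variable
        return b(lambda value: fold(t[2], a, b, {**env, t[1]: value}))
    return lambda a, b: fold(t, a, b, {})

# One possible choice of a and b: count the nodes of the term,
# giving every variable occurrence the value 1.
def size_app(m, n):
    return 1 + m + n

def size_abs(body):
    return 1 + body(1)

small_omega = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
big_omega = ("app", small_omega, small_omega)
```

Applying `theta(big_omega)` to `size_app` and `size_abs` counts the nine nodes of $(\lambda x.xx)(\lambda x.xx)$ without ever reducing the term, which illustrates that $\theta(t)$ is a normal form that exposes the syntax of $t$.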

10 Appendix 

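The program $M : \mathrm{term} \to \mathrm{exp}$ below translates from de
Bruijn notation as in Definition 16 to lambda terms represented as in
Definition 11. We need an auxiliary program
$\mathrm{ch} : \mathrm{term} \times \mathrm{term} \times \mathrm{nat} \to \mathrm{term}$
and an auxiliary constructor $\mathrm{Bx} : \mathrm{exp} \to \mathrm{term}$.

$M\,(\mathrm{abs}\,t) := \mathrm{Abs}(\lambda u.M\,(\mathrm{ch}\,t\,(\mathrm{Bx}\,u)\,1))$  (1)
$M\,(\mathrm{app}\,x\,y) := \mathrm{App}(M\,x)(M\,y)$  (2)
$M\,(\mathrm{Bx}\,u) := u$  (3)
$\mathrm{ch}\,(\mathrm{app}\,x\,y)\,u\,j := \mathrm{app}(\mathrm{ch}\,x\,u\,j)(\mathrm{ch}\,y\,u\,j)$  (4)
$\mathrm{ch}\,(\mathrm{abs}\,x)\,u\,j := \mathrm{abs}(\mathrm{ch}\,x\,u\,(1 + j))$  (5)
$\mathrm{ch}\,(\mathrm{var}\,m)\,u\,j := \text{if } m = j \text{ then } u \text{ else } (\mathrm{var}\,m)$  (6)
$\mathrm{ch}\,(\mathrm{Bx}\,x)\,u\,j := \mathrm{Bx}\,x$  (7)

We now define a term
$\lambda x.\mathrm{db}\,x\,0 : \mathrm{exp} \to \mathrm{term}$ which performs
the inverse translation. The program
$\mathrm{db} : \mathrm{exp} \times \mathrm{nat} \to \mathrm{term}$ uses the
auxiliary constructor $\mathrm{Var} : \mathrm{nat} \to \mathrm{exp}$ (not to be
confused with $\mathrm{var} : \mathrm{nat} \to \mathrm{term}$).

$\mathrm{db}\,(\mathrm{Abs}\,t)\,n := \mathrm{abs}(\mathrm{db}\,(t(\mathrm{Var}\,n))\,(1 + n))$  (8)
$\mathrm{db}\,(\mathrm{App}\,x\,y)\,n := \mathrm{app}(\mathrm{db}\,x\,n)(\mathrm{db}\,y\,n)$  (9)
$\mathrm{db}\,(\mathrm{Var}\,m)\,n := \mathrm{var}(n - m)$  (10)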
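
The two translations of the appendix can be checked against each other with a Python sketch in which the argument of Abs is a Python function. The sketch follows equations (1) to (10) as read above, with the results of ch built from the lower-case constructors app, abs and var, as its type requires. The tuple encoding is an assumption of the sketch, and `M` rejects de Bruijn terms with free indices.

```python
def M(t):
    """De Bruijn term -> higher order term, equations (1)-(3)."""
    tag = t[0]
    if tag == "abs":
        return ("Abs", lambda u: M(ch(t[1], ("Bx", u), 1)))      # (1)
    if tag == "app":
        return ("App", M(t[1]), M(t[2]))                         # (2)
    if tag == "Bx":
        return t[1]                                              # (3)
    raise ValueError("free de Bruijn index")

def ch(t, u, j):
    """Replace the index bound by the binder j levels up with u, equations (4)-(7)."""
    tag = t[0]
    if tag == "app":
        return ("app", ch(t[1], u, j), ch(t[2], u, j))           # (4)
    if tag == "abs":
        return ("abs", ch(t[1], u, 1 + j))                       # (5)
    if tag == "var":
        return u if t[1] == j else t                             # (6)
    return t                                                     # (7)

def db(t, n):
    """Higher order term -> de Bruijn term, equations (8)-(10)."""
    tag = t[0]
    if tag == "Abs":
        return ("abs", db(t[1](("Var", n)), 1 + n))              # (8)
    if tag == "App":
        return ("app", db(t[1], n), db(t[2], n))                 # (9)
    return ("var", n - t[1])                                     # (10), tag == "Var"

# lambda lambda 2 1 (lambda 3) in both representations
first_order = ("abs", ("abs", ("app", ("app", ("var", 2), ("var", 1)), ("abs", ("var", 3)))))
higher_order = ("Abs", lambda x: ("Abs", lambda y: ("App", ("App", x, y), ("Abs", lambda z: x))))
```

Translating `higher_order` with `db` gives `first_order`, and translating `first_order` with `M` and back with `db` returns the term unchanged.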
Abstract. The constraint language for lambda structures (CLLS) can
model lambda terms that are known only partially. In this paper, we
introduce beta reduction constraints to describe beta reduction steps
between partially known lambda terms. We show that beta reduction
constraints can be expressed in an extension of CLLS by group
parallelism. We then extend a known semi-decision procedure for CLLS to also
deal with group parallelism and thus with beta-reduction constraints.



1 Introduction 

The constraint language for lambda structures (CLLS) can model
$\lambda$-terms that are known only partially. The idea is to see a
$\lambda$-term as a $\lambda$-structure: a tree decorated with binding edges.
One can then describe a $\lambda$-term partially as one would describe a
standard tree structure. CLLS provides dominance, parallelism and binding
constraints for this purpose.

This paper shows how to lift $\beta$-reduction to partial descriptions of
$\lambda$-terms in CLLS. We define beta reduction constraints, which allow a
declarative description of the result of a single $\beta$-reduction step. At
first, this description is very implicit; it is made explicit by solving the
constraints. To this end, we show how beta reduction constraints can be
expressed as group parallelism constraints. Then we adapt a known semi-decision
procedure for CLLS to also deal with group parallelism and thus with
beta-reduction constraints.

Beta-reduction constraints lay the foundation for underspecified beta
reduction, which is needed in the application of CLLS to semantic
underspecification of natural language. Given a CLLS constraint describing many
lambda terms, the aim is to compute a compact description of all corresponding
beta normal forms efficiently. In particular, we want to avoid enumerating and
individually beta-reducing the described lambda terms. (Enumerating is neither
efficient, nor is its result compact.) A recent proposal towards underspecified
beta reduction is described by the authors in a follow-up paper.

Solving beta reduction constraints is very much different from higher-order
unification in that CLLS constraints express $\alpha$-equality rather than
$\alpha\beta\eta$-equality. CLLS is closely linked to context unification and it
can express sharing as in optimal lambda reduction or calculi with explicit
substitutions, but can also describe several lambda terms at once.

* Supported by the DFG through the Graduiertenkolleg Kognition in Saarbrücken.

** Supported by the Collaborative Research Center (SFB) 378 of the DFG and the
Procope project of the DAAD.

Plan. We first recall the definition of CLLS restricted to dominance and
λ-binding constraints (Sec. 2); then we go through two examples to give an idea
of how one might lift β-reduction to partial descriptions (Sec. 3). We next define
β-reduction constraints (Sec. 4). Then we define group parallelism constraints
and show how they can express β-reduction constraints (Sec. 5). Finally, we
present a sound and complete semi-decision procedure for CLLS with group
parallelism (Sec. 6) and illustrate it with an example (Sec. 7).

2 CLLS with Dominance and Lambda Binding Constraints

We first introduce λ-structures and then a fragment of CLLS for their
description. This fragment contains dominance and λ-binding constraints, but not
parallelism and anaphoric binding constraints.

We assume a signature $\Sigma = \{f, g, \ldots\}$ of function symbols, each equipped
with an arity $\mathrm{ar}(f) \geq 0$. Symbols of arity 0 are constants, written as $a, b, \ldots$

A tree $\theta$ is a ground term over $\Sigma$, e.g. $g(f(a, b))$. A node
of a tree is identified with a path from the root to this node,
expressed by a word over the naturals (excluding 0). $\epsilon$ is
the empty path, and $\pi_1\pi_2$ the concatenation of $\pi_1$ and $\pi_2$.
$\pi$ is a prefix of a path $\pi'$ if there is a (possibly empty) $\pi''$
s.t. $\pi\pi'' = \pi'$. The set of all nodes of a tree $\theta$ is defined as

  $D_{f(\theta_1,\ldots,\theta_n)} = \{\epsilon\} \cup \{i\pi \mid \pi \in D_{\theta_i},\ 1 \leq i \leq n\}$

A tree $\theta$ can be characterized uniquely by the set $D_\theta$ of its
nodes and a labeling function $L_\theta : D_\theta \to \Sigma$.

Fig. 1. Tree structure for $g(f(a,b))$

Now we can consider λ-terms as pairs of a tree and
a binding function that encodes variable binding. We assume
that $\Sigma$ contains the symbols var (arity 0, for variables),
lam (arity 1, for abstraction), and @ (arity 2, for
application), and the quantifiers $\exists$ and $\forall$ (arity 1). The tree
uses these symbols to reflect the structure of the λ-term
and first-order connectives. The binding function $\lambda$
explicitly maps var-labeled nodes to binders. For example,
Fig. 2 shows a representation of the term $\lambda x.f(x)$. Here $\lambda(12) = \epsilon$. Such a pair
of a tree and a binding function is called a λ-structure.

Fig. 2. The λ-structure of $\lambda x.f(x)$

Definition 1. A λ-structure $\tau$ is a pair $(\theta, \lambda)$ of a tree $\theta$ and a total binding
function $\lambda : L_\theta^{-1}(\mathsf{var}) \to L_\theta^{-1}(\{\mathsf{lam}, \exists, \forall\})$ such that $\lambda(\pi)$ is always a prefix of $\pi$.

A λ-structure corresponds uniquely to a closed λ-term modulo α-renaming.
We freely consider λ-structures as first-order structures with domain $D_\theta$. As
such, they define relations of labeling, binding, inverse binding, dominance,
disjointness, and inequality of nodes. (Later we will add group parallelism and
β-reduction relations.) The labeling relation $\pi{:}f(\pi_1, \ldots, \pi_n)$ holds in a λ-structure
$\tau$ if $L_\theta(\pi) = f$, $\mathrm{ar}(f) = n$ and $\pi_i = \pi i$ for all $1 \leq i \leq n$. Dominance $\lhd^*$ is the
prefix relation between paths of $D_\theta$; inequality $\neq$ is simply inequality of paths;
disjointness $\pi \perp \pi'$ holds if neither $\pi \lhd^* \pi'$ nor $\pi' \lhd^* \pi$. We will also consider
intersections, unions, and complements of these relations; for instance, proper dominance
$\lhd^+$ is $\lhd^* \cap \neq$, and equality $=$ is $\lhd^* \cap \rhd^*$. The relation $\lambda^{-1}(\pi_0) = \{\pi_1, \ldots, \pi_n\}$ states
that $\pi_1, \ldots, \pi_n$, and only those nodes, are λ-bound by $\pi_0$. Note that an element
of a set can be mentioned multiply, i.e. $\{\pi, \pi\} = \{\pi\}$.
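To make these notions concrete, the following Python sketch (an illustration only, not part of the formal development) represents a tree by a dictionary from paths to labels and a binding function by a dictionary from var-labeled nodes to binder nodes. The names `ARITY`, `is_tree`, `is_lambda_structure` and `inverse_binding`, as well as the small signature and the example term, are assumptions made for this sketch.

```python
from typing import Dict, Set, Tuple

Path = Tuple[int, ...]  # a node is its path from the root; () stands for the empty path

# Hypothetical signature for this sketch: label -> arity
ARITY = {"var": 0, "a": 0, "lam": 1, "exists": 1, "forall": 1, "@": 2}
BINDERS = {"lam", "exists", "forall"}


def dominates(p: Path, q: Path) -> bool:
    """Dominance: p is a (possibly equal) prefix of q."""
    return q[:len(p)] == p


def disjoint(p: Path, q: Path) -> bool:
    """Disjointness: neither node dominates the other."""
    return not dominates(p, q) and not dominates(q, p)


def is_tree(labels: Dict[Path, str]) -> bool:
    """The labeled paths form a tree whose nodes have exactly ar(f) children."""
    if () not in labels:
        return False
    for p, f in labels.items():
        if p and p[:-1] not in labels:
            return False  # a node without a parent
        children = {q[-1] for q in labels if len(q) == len(p) + 1 and q[:-1] == p}
        if children != set(range(1, ARITY[f] + 1)):
            return False
    return True


def is_lambda_structure(labels: Dict[Path, str], binding: Dict[Path, Path]) -> bool:
    """Definition 1: a total binding function from var nodes to dominating binders."""
    var_nodes = {p for p, f in labels.items() if f == "var"}
    return (
        is_tree(labels)
        and set(binding) == var_nodes
        and all(labels.get(b) in BINDERS and dominates(b, v) for v, b in binding.items())
    )


def inverse_binding(binding: Dict[Path, Path], binder: Path) -> Set[Path]:
    """The set of nodes bound by `binder` (the inverse binding relation)."""
    return {v for v, b in binding.items() if b == binder}


# lam(@(var, a)), i.e. the term \x. x a, with the variable bound at the root
LABELS = {(): "lam", (1,): "@", (1, 1): "var", (1, 2): "a"}
BINDING = {(1, 1): ()}
```

In this encoding, dominance is just the prefix test on tuples, so the remaining relations (disjointness, proper dominance, equality) come out as Boolean combinations of it, mirroring the set operations on relations used above.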

Now we can define dominance and binding constraints to talk about
λ-structures as follows; $X, Y, Z$ are variables that denote nodes.

  $\varphi, \psi ::= X R Y \mid X{:}f(X_1, \ldots, X_n) \mid \varphi \wedge \psi \mid \mathit{false}$    ($\mathrm{ar}(f) = n$)
  $\qquad\quad \mid \lambda(X){=}Y \mid \lambda^{-1}(X_0){=}\{X_1, \ldots, X_n\}$
  $R, R' ::= \lhd^* \mid \rhd^* \mid \perp \mid \neq \mid R \cup R' \mid R \cap R'$

A constraint $\varphi$ is a conjunction of literals (for dominance, labeling, etc). Set
operators in relation descriptors $R$ are mainly needed for processing
purposes. As above we also use $\lhd^+$, $=$ to abbreviate set operators. The one
literal that has not appeared in the literature before is the inverse binding literal
$\lambda^{-1}(X){=}\{X_1, \ldots, X_n\}$, which matches the inverse binding relation.

We will also use first-order formulas $\Phi$ built over constraints. We write $V(\Phi)$
for the set of variables occurring in $\Phi$. Given a pair $(\tau, \sigma)$ of a λ-structure $\tau$ and
a variable assignment $\sigma : \mathcal{G} \to D_\theta$, for some set $\mathcal{G} \supseteq V(\Phi)$, we can associate a
truth value to $\Phi$ in the usual Tarskian sense. We say that $(\tau, \sigma)$ satisfies $\Phi$ iff
$\Phi$ evaluates to true under $(\tau, \sigma)$. In this case, we write $(\tau, \sigma) \models \Phi$ and say that
$(\tau, \sigma)$ is a solution of $\Phi$. $\Phi$ is satisfiable iff it has a solution. Entailment $\Phi \models \Phi'$
means that all solutions of $\Phi$ are also solutions of $\Phi'$; equivalence $\Phi \mathrel{=\!\!|\models} \Phi'$ is
mutual entailment.

We draw constraints as graphs with the nodes
representing variables. E.g. Fig. 3 is the graph of
$\lambda^{-1}(X){=}\{X_1, X_2\} \wedge X \lhd^* X_1 \wedge X \lhd^* X_2$. Labels
and solid lines indicate labeling literals, while dotted
lines represent dominance. Dashed arrows indicate
the binding relation; disjointness and inequality
literals are not represented.

Fig. 3. A constraint graph

3 Examples

Before we begin with the formal investigation of beta reduction constraints, we
first go through two examples which illustrate how beta-reduction can be lifted
to descriptions of λ-structures in CLLS, and why the problem is nontrivial.

First, consider the left constraint in Fig. 4. The constraint contains just one
redex, and it is easy to see how to obtain a description of the reduced formulas.
We can essentially replace the bound variables with the argument description;
the result is shown on the right-hand side of Fig. 4.

Fig. 4. Underspecified representations of 'Every student did not pay attention' before
and after beta reduction

Fig. 5. Representation of 'Peter and Marc do not sing', wrong description of the reduct

Incidentally, the left constraint in Fig. 4 is an underspecified description of
the ambiguous sentence 'Every student didn't pay attention'. Its two readings are
given by the HOL formulas:

  $\forall x\,(\mathrm{stud}\ x \to (\lambda y\, \neg(\mathrm{payatt}\ y))\ x)$,    $\neg\forall x\,(\mathrm{stud}\ x \to (\lambda y\, \mathrm{payatt}\ y)\ x)$.

(These are the only models of the constraint that do not contain additional
material not mentioned in the constraint. We ignore this aspect of "solution
minimality" in this paper and always consider all solutions.)

The naive replacement approach, however, does not work in general. Fig. 5
shows an example (which describes the ambiguous sentence 'Peter and Marc
do not sing.') This constraint also describes a β-redex, this time one where the
binder binds two variables. Here it is no longer trivial to replace the bound
variables by the argument description, as we do not know what belongs to the
argument. There is no useful choice for the part of the constraint that should be
duplicated; for example, if we decide not to duplicate the negation, we get the
description on the right-hand side of Fig. 5, which lacks one solution. Describing
the reduct using β-reduction constraints solves this problem; the description is
correct even if it is not yet known which variables belong to the body and the
argument of the redex.

4 Beta Reduction Constraints 

In this section, we add the β-reduction relation to lambda structures and
extend the constraint language with β-reduction constraints to talk about it. The
β-reduction relation on nodes of a lambda structure corresponds exactly to
traditional beta reduction on lambda terms.

Stated in the unfolded notation for λ-terms we use to build the λ-structures
(with application as an internal label etc.), β-reduction looks as follows:

  $C(@(\lambda x.B, A)) \to C(B[x/A])$    ($x$ free for $A$)

We call the left-hand side the reducing tree, the right-hand
side the reduct of the β-reduction. We call $C$ the
context, $B$ the body, and $A$ the argument of the reduction
step.

An important notion throughout the paper is that of tree
segments. Intuitively, a tree segment is a subtree which may
itself be missing some subtrees (see Fig. 6). The context,
body, and argument of a beta reduction step will all be tree
segments.

Definition 2. A tree segment $\alpha$ of a λ-structure $\tau$ is given by a tuple
$\pi_0/\pi_1, \ldots, \pi_n$ of nodes in $D_\tau$, such that $\tau \models \pi_0 \lhd^* \pi_i$ and $\tau \models \pi_i (\perp \cup =) \pi_j$ for
$1 \leq i < j \leq n$. The node $r(\alpha) = \pi_0$ is called the root, and $hs(\alpha) = \pi_1, \ldots, \pi_n$ is
the sequence of holes of $\alpha$. If $n = 0$ we write $\alpha = \pi_0/$. The nodes between the
root $r(\alpha)$ and the holes $hs(\alpha)$ are defined as

  $b(\alpha) =_{df} \{\pi \in D_\tau \mid r(\alpha) \lhd^* \pi \wedge \bigwedge_{\pi' \in \{hs(\alpha)\}} \neg\, \pi' \lhd^+ \pi\}$

To exempt the holes of the segment, we define $b^-(\alpha) =_{df} b(\alpha) - \{hs(\alpha)\}$.
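As a small illustration of Definition 2, the sketch below computes the node sets of a segment for a tree given by its set of paths. It is only a sketch under the assumption that nodes are encoded as tuples of positive integers; the function names `is_segment`, `between` and `proper` and the example tree are introduced here for illustration.

```python
from typing import Sequence, Set, Tuple

Path = Tuple[int, ...]


def dominates(p: Path, q: Path) -> bool:
    """p is a (possibly equal) prefix of q."""
    return q[:len(p)] == p


def properly_dominates(p: Path, q: Path) -> bool:
    return p != q and dominates(p, q)


def is_segment(domain: Set[Path], root: Path, holes: Sequence[Path]) -> bool:
    """The root dominates every hole; holes are pairwise disjoint or equal."""
    nodes_exist = root in domain and all(h in domain for h in holes)
    root_on_top = all(dominates(root, h) for h in holes)
    holes_apart = all(
        g == h or not (dominates(g, h) or dominates(h, g))
        for i, g in enumerate(holes)
        for h in holes[i + 1:]
    )
    return nodes_exist and root_on_top and holes_apart


def between(domain: Set[Path], root: Path, holes: Sequence[Path]) -> Set[Path]:
    """b(alpha): nodes below the root that are not properly below any hole."""
    return {
        p for p in domain
        if dominates(root, p) and not any(properly_dominates(h, p) for h in holes)
    }


def proper(domain: Set[Path], root: Path, holes: Sequence[Path]) -> Set[Path]:
    """b^-(alpha): the nodes of the segment without its holes."""
    return between(domain, root, holes) - set(holes)


# node set of the tree f(g(a), f(a, b))
DOMAIN = {(), (1,), (1, 1), (2,), (2, 1), (2, 2)}
```

For the segment with root `()` and holes `(1,)` and `(2, 2)`, the node `(1, 1)` lies properly below a hole and is therefore excluded, while the holes themselves belong to $b(\alpha)$ but not to $b^-(\alpha)$.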

Definition 3. A correspondence function between tree segments $\alpha, \beta$ in a
lambda structure $\tau$ is a bijective mapping $c : b(\alpha) \to b(\beta)$ which satisfies for
all nodes $\pi_1, \ldots, \pi_n$ of $\tau$:

1. The roots correspond: $c(r(\alpha)) = r(\beta)$

2. The sequences of holes correspond:

  $hs(\alpha) = \pi_1, \ldots, \pi_n \Leftrightarrow hs(\beta) = c(\pi_1), \ldots, c(\pi_n)$

3. Labels and children correspond within the proper segments. For $\pi \in b^-(\alpha)$
and label $f$:

  $\pi{:}f(\pi_1, \ldots, \pi_n) \Leftrightarrow c(\pi){:}f(c(\pi_1), \ldots, c(\pi_n))$.
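Since roots and children must correspond, a correspondence function, if one exists, has to preserve the path of each node relative to the segment root. The following sketch uses this observation to compute the unique candidate and to check the conditions of Definition 3. It assumes that both arguments are valid tree segments in the sense of Definition 2; the names `between` and `correspondence` and the example tree are hypothetical.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Path = Tuple[int, ...]
Segment = Tuple[Path, Sequence[Path]]  # (root, sequence of holes)


def between(domain: Set[Path], root: Path, holes: Sequence[Path]) -> Set[Path]:
    """b(alpha): nodes below the root that are not properly below any hole."""
    def dominates(p: Path, q: Path) -> bool:
        return q[:len(p)] == p
    return {
        p for p in domain
        if dominates(root, p) and not any(h != p and dominates(h, p) for h in holes)
    }


def correspondence(
    labels: Dict[Path, str], seg_a: Segment, seg_b: Segment
) -> Optional[Dict[Path, Path]]:
    """Return the correspondence function between two valid segments, or None."""
    (root_a, holes_a), (root_b, holes_b) = seg_a, seg_b
    domain = set(labels)
    nodes_a = between(domain, root_a, holes_a)
    nodes_b = between(domain, root_b, holes_b)
    # The only candidate maps each node to the node with the same relative path.
    c = {p: root_b + p[len(root_a):] for p in nodes_a}
    if set(c.values()) != nodes_b:
        return None  # not a bijection onto the nodes of the second segment
    if [c[h] for h in holes_a] != list(holes_b):
        return None  # the sequences of holes do not correspond
    if any(labels[p] != labels[c[p]] for p in nodes_a - set(holes_a)):
        return None  # labels differ inside the proper segment
    return c


# the tree f(g(a), g(b))
LABELS = {(): "f", (1,): "g", (1, 1): "a", (2,): "g", (2, 1): "b"}
```

The two subtrees `g(a)` and `g(b)` correspond as segments with holes at the constants, but not as complete subtrees, because labels are compared only within the proper segments.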

We next define the β-reduction relation on λ-structures to be a relation
between nodes in the same λ-structure. This allows us to see the β-reduction
relation as a conservative extension of the existing λ-structures. The
representations of both the reducing and the reduced term are part of the same big
λ-structure; in Fig. 7 these are the trees rooted by $r(\gamma)$ and $r(\gamma')$ respectively.

A redex in a lambda structure is a sequence of segments $(\gamma, \beta, \alpha)$ of that
λ-structure that are connected by nodes $\pi_0, \pi_1$ with the following properties:

  $hs(\gamma) = \pi_0$, $\pi_0{:}@(\pi_1, r(\alpha))$, $\pi_1{:}\mathsf{lam}(r(\beta))$, and $\lambda^{-1}(\pi_1) = \{hs(\beta)\}$




Fig. 6. The tree segment $\pi_0/\pi_1, \pi_2$



M. Bodirsky et al. 




Fig. 7. The beta reduction relation for a binary redex 

The lambda structure in Fig. 7 contains a redex $(\gamma, \beta, \alpha)$ and also its reduct
$(\gamma', \beta', \alpha'_1, \alpha'_2)$. There, corresponding segments ($\gamma$ to $\gamma'$, $\beta$ to $\beta'$, $\alpha$ to both $\alpha'_1$
and $\alpha'_2$) have the same structure.

Definition 4 (Beta Reduction Relation). Let $\tau$ be a λ-structure. Then

  $(\gamma, \beta, \alpha) \to_\beta (\gamma', \beta', \alpha'_1, \ldots, \alpha'_n)$

holds in $\tau$ iff first, $(\gamma, \beta, \alpha)$ form a redex. Second, there are correspondence
functions $c_\gamma$ between $\gamma, \gamma'$, $c_\beta$ between $\beta, \beta'$ and $c^i_\alpha$ between $\alpha, \alpha'_i$ (for $1 \leq i \leq n$),
such that for each $\delta, \delta'$ among these segment pairs with correspondence function
$c$ between them and for each $\pi \in b^-(\delta)$, the following conditions hold:

1. for a var-labeled node bound in the same segment, the correspondent is
bound by the $c$-corresponding binder node.

  $\lambda(\pi) \in b^-(\delta) \Rightarrow \lambda(c(\pi)) = c(\lambda(\pi))$

2. for a var-labeled node bound in the context $\gamma$, the correspondent is bound
by the $c_\gamma$-corresponding binder node.

  $\lambda(\pi) \in b^-(\gamma) \Rightarrow \lambda(c(\pi)) = c_\gamma(\lambda(\pi))$

3. for a var-labeled node bound above the reducing tree, the corresponding node
is bound at the same place:

  $\lambda(\pi) \notin b(r(\gamma)/) \Rightarrow \lambda(c(\pi)) = \lambda(\pi)$

The β-reduction relation on λ-structures models β-reduction on λ-terms
faithfully. This even holds for λ-terms with global variables, although
λ-structures can only model closed λ-terms. Global variables correspond to
var-labeled nodes that are bound in the surrounding tree, i.e. above the root node
of the context of the redex. Rule 3 of Def. 4 thus ensures a proper treatment of
global variables.

Capturing in β-reduction is usually avoided by a freeness condition. For
instance, one cannot simply β-reduce $(\lambda x.\lambda y.x)\,y$ without renaming the bound
occurrence of $y$ beforehand. Otherwise, the global variable $y$ in the argument gets
captured by the binder $\lambda y$. The following proposition states that the analogous
problem can never arise with the β-reduction relation on λ-structures.
Proposition 1 (No Capturing). Global variables in the argument are never
captured by a λ-binder in the body: with the notation of Def. 4 this means that
no var-labeled node in $b(\alpha'_i)$ is bound by a lam-labeled node in $b^-(\beta')$.

Proof. Assume there exists a node $\pi'$ in $b(\alpha'_i)$ such that $\lambda(\pi') \in b^-(\beta')$. There
must be a corresponding var-labeled node $\pi$ with $c^i_\alpha(\pi) = \pi'$, which is bound
either in $\alpha$, in $\gamma$ or outside the reducing tree. In the first case property (1) leads
to a contradiction, in the second case property (2), and in the third case (3).
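The absence of capturing can also be observed on a concrete example. The following Python sketch performs one β-reduction step on a λ-structure given as a pair of dictionaries (labels and binding function). It is a simplification made for illustration: unlike Def. 4, which keeps the reducing tree and the reduct inside one big λ-structure, the hypothetical function `beta_step` builds the reduct as a separate λ-structure. Binders are copied along with their segments, and binders above the redex stay where they are, as in the three conditions of Def. 4.

```python
from typing import Dict, Tuple

Path = Tuple[int, ...]


def below(root: Path, p: Path) -> bool:
    """root dominates p."""
    return p[:len(root)] == root


def shift(p: Path, src: Path, dst: Path) -> Path:
    """Move p from the subtree at src to the subtree at dst; other nodes stay put."""
    return dst + p[len(src):] if below(src, p) else p


def beta_step(labels: Dict[Path, str], binding: Dict[Path, Path], p0: Path):
    """Reduce the redex whose application node is p0 and return the reduct."""
    lam, arg_root = p0 + (1,), p0 + (2,)
    assert labels[p0] == "@" and labels[lam] == "lam"
    body_root = lam + (1,)
    holes = {v for v, b in binding.items() if b == lam}  # variables bound by the redex
    new_labels: Dict[Path, str] = {}
    new_binding: Dict[Path, Path] = {}

    # Context: every node that is not below the application node is kept as it is.
    for p, f in labels.items():
        if not below(p0, p):
            new_labels[p] = f
            if p in binding:
                new_binding[p] = binding[p]

    # Body: moved up to the position of the application node, without its holes.
    for p, f in labels.items():
        if below(body_root, p) and p not in holes:
            q = shift(p, body_root, p0)
            new_labels[q] = f
            if p in binding:
                new_binding[q] = shift(binding[p], body_root, p0)

    # Argument: one copy per hole; binders outside the argument are left unchanged.
    for h in holes:
        dst = shift(h, body_root, p0)
        for p, f in labels.items():
            if below(arg_root, p):
                q = shift(p, arg_root, dst)
                new_labels[q] = f
                if p in binding:
                    new_binding[q] = shift(binding[p], arg_root, dst)
    return new_labels, new_binding


# \y. ((\x. \z. x) y): the argument y is a global variable of the redex
LABELS = {(): "lam", (1,): "@", (1, 1): "lam", (1, 1, 1): "lam",
          (1, 1, 1, 1): "var", (1, 2): "var"}
BINDING = {(1, 1, 1, 1): (1, 1), (1, 2): ()}
REDUCT = beta_step(LABELS, BINDING, (1,))
```

In the reduct, the copied argument variable sits below the inner binder but is still bound by the root, which corresponds to the term $\lambda y.\lambda z.y$; no renaming step was needed because binding is recorded by node rather than by name.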

The β-reduction relation conservatively extends λ-structures. We extend our
constraint syntax similarly, by β-reduction literals, which are interpreted by the
β-reduction relation. Let a segment term $A, B, C$ be given by the following
abstract syntax:

  $A, B, C =_{df} X_0/X_1, \ldots, X_n$

Then β-reduction literals have the following form:

  $(C, B, A) \to_\beta (C', B', A'_1, \ldots, A'_n)$

5 Group Parallelism 

In this section, we extend dominance and binding constraints with group
parallelism constraints (Def. 5), a generalization of the parallelism constraints found
in CLLS. Then we show that CLLS with group parallelism can express
β-reduction constraints (Thm. 1).

Group parallelism relates two groups, i.e. sequences of tree segments. It
requires that corresponding entries in the two sequences must have the same tree
structures, and binding in the two groups must be parallel. The following
definition makes this precise; all conditions but the last are illustrated in Fig. 8.

Definition 5. The group parallelism relation $\sim$ of a λ-structure $\tau$ is the greatest
symmetric relation between groups of the same size such that

  $(\alpha_1, \ldots, \alpha_n) \sim (\alpha'_1, \ldots, \alpha'_n)$

implies there are correspondence functions $c_k : b(\alpha_k) \to b(\alpha'_k)$ for all $1 \leq k \leq n$
that satisfy the following properties for all $1 \leq i, j \leq n$ and $\pi \in b^-(\alpha_i)$:

(same.seg) for a var-labeled node bound in the same segment, the
corresponding node is bound correspondingly:

  $\lambda(\pi) \in b^-(\alpha_i) \Rightarrow \lambda(c_i(\pi)) = c_i(\lambda(\pi))$

(diff.seg) for a var-labeled node bound outside $\alpha_i$ but inside $\alpha_j$, the
correspondent is bound at the corresponding place with respect to $c_j$:

  $\lambda(\pi) \in b^-(\alpha_j) \wedge \lambda(\pi) \notin b^-(\alpha_i) \Rightarrow \lambda(c_i(\pi)) = c_j(\lambda(\pi))$

(outside) corresponding var-labeled nodes with binders outside the group
segments are bound by the same binder node:

  $\lambda(\pi) \notin \bigcup_{k=1}^{n} b^-(\alpha_k) \Rightarrow \lambda(c_i(\pi)) = \lambda(\pi)$

(hang) there are no hanging binders:

  $\lambda^{-1}(\pi) \subseteq \bigcup_{k=1}^{n} b^-(\alpha_k)$

Fig. 8. Possible bindings in a group parallelism (panels: (same.seg), (diff.seg), (outside))

On the syntactic side, we extend CLLS by group parallelism literals that
are interpreted by the group parallelism relation. Let $A_1, \ldots, A_m, A'_1, \ldots, A'_m$ be
segment terms, then group parallelism literals have the form

  $(A_1, \ldots, A_m) \sim (A'_1, \ldots, A'_m)$

Group parallelism extends ordinary parallelism constraints, which are
simply the special case for groups of size one. This extension is proper; ordinary
parallelism constraints cannot handle the case where a node is bound in a
different segment of the group, as illustrated in the (diff.seg) part of Fig. 8. From
the perspective of ordinary parallelism, the node is bound outside the parallel
segment, and thus the (outside) condition applies, and the corresponding node
must be bound by the same binder.

Another interesting observation in Fig. 8 is that the conditions (same.seg)
and (diff.seg) must be mutually exclusive. If (diff.seg) were applicable in the
leftmost case, it would enforce $\lambda(c_1(\pi)) = c_2(\lambda(\pi))$, which is clearly wrong.

Now we show how to encode beta reduction constraints in CLLS. First,
we define the following abbreviation to express that the segment term $A =
X_0/X_1, \ldots, X_n$ indeed denotes a tree segment:

  $\mathrm{seg}(A) =_{df} \bigwedge_{i=1}^{n} X_0 \lhd^* X_i \wedge \bigwedge_{1 \leq i < j \leq n} X_i (\perp \cup =) X_j$

Using this, we can axiomatize a redex in CLLS. For segment terms $A = X_2/$,
$B = X_3/X_4, \ldots, X_n$, $C = X/X_0$, we set:

  $\mathrm{redex}_{X_0,X_1}(C, B, A) =_{df} \mathrm{seg}(A) \wedge \mathrm{seg}(B) \wedge \mathrm{seg}(C) \wedge X_0{:}@(X_1, X_2)$
  $\qquad \wedge\ X_1{:}\mathsf{lam}(X_3) \wedge \lambda^{-1}(X_1) = \{X_4, \ldots, X_n\}$
Theorem 1. Beta reduction constraints can be expressed in CLLS with group
parallelism via the following equivalence:

  $(C, B, A) \to_\beta (C', B', A'_1, \ldots, A'_n) \mathrel{=\!\!|\models} \exists X_0, X_1.\ \mathrm{redex}_{X_0,X_1}(C, B, A)$
  $\qquad \wedge\ (C, B, A, \ldots, A) \sim (C', B', A'_1, \ldots, A'_n)$

Proof. We check the two entailments separately, first from right to left.
Let $\sigma$ be a variable assignment into some λ-structure that solves the right hand
side. Properties (same.seg), (diff.seg), and (outside) for group parallelism (Def.
5) then subsume the corresponding properties of β-reduction (Def. 4).

For the other direction, let $(\tau, \sigma')$ solve the beta-reduction literal on
the left hand side. According to Def. 4 there exists a redex $(\gamma, \beta, \alpha)$
in $\tau$ with nodes $\pi_0, \pi_1$ as in Sec. 4. Let $\sigma$ be the variable assignment
$\sigma'[\pi_0/X_0, \pi_1/X_1, \alpha/A, \beta/B, \gamma/C]$. It remains to check that $(\tau, \sigma)$ solves the group
parallelism literal on the right hand side.

We consider the symmetric relation $R_1$ which relates the group
$(\sigma(C), \sigma(B), \sigma(A), \ldots, \sigma(A))$ to $(\sigma(C'), \sigma(B'), \sigma(A'_1), \ldots, \sigma(A'_n))$ and
conversely. We show that $R_1$ satisfies all conditions in the definition of group
parallelism (Def. 5), which means that $R_1$ is subsumed by the group parallelism
relation $\sim$.

First of all, both above groups satisfy condition (hang). This is clear for the
group $(\sigma(C'), \sigma(B'), \sigma(A'_1), \ldots, \sigma(A'_n))$, which covers the complete subtree below
$r(\sigma(C'))$. A similar argument applies to $(\sigma(C), \sigma(B), \sigma(A), \ldots, \sigma(A))$, which
covers the whole tree below $r(\sigma(C))$ except the @-labeled node $\pi_0$, the
lam-labeled node $\pi_1$ and the var-labeled nodes $hs(\sigma(B))$. But these var-labeled nodes
are bound by $\pi_1$.

By Def. 4 there exist correspondence functions $c_\gamma$ between segments $\sigma(C)$,
$\sigma(C')$, $c_\beta$ between $\sigma(B)$, $\sigma(B')$ and $c^i_\alpha$ between $\sigma(A)$, $\sigma(A'_i)$ for $1 \leq i \leq n$. Since
$R_1$ is symmetric, we have to check properties (same.seg), (diff.seg), and (outside)
for group parallelism (Def. 5) for the correspondence functions and their inverse
functions.

We only show the particularly interesting property (diff.seg) for a
correspondence function $(c^i_\alpha)^{-1}$ with $1 \leq i \leq n$. Let $\pi'$ be a var-labeled node in
$b^-(\sigma(A'_i))$, and $\lambda(\pi') \notin b^-(\sigma(A'_i))$. There are three cases: $\lambda(\pi') \in b^-(\sigma(C'))$, or
$\lambda(\pi') \in b^-(\sigma(B'))$, or $\lambda(\pi') \in b^-(\sigma(A'_j))$ for some $1 \leq j \leq n$. The second case
is impossible by Proposition 1. The third case is impossible as the holes of the
segment $\sigma(B')$ are disjoint or equal (Def. 2). We can thus concentrate on the
first case. Let $\pi$ be the corresponding node of $\pi'$, i.e. $c^i_\alpha(\pi) = \pi'$. The node $\pi$ has
to be var-labeled by Def. 3. Properties (same.seg) and (outside) yield
$\lambda(\pi) \in \sigma(C)$ (some computation is needed here). Thus, Property 2 of Def. 4 can
be applied. It implies $\lambda(\pi') = c_\gamma(\lambda(\pi))$, i.e. $c_\gamma^{-1}(\lambda(\pi')) = \lambda((c^i_\alpha)^{-1}(\pi'))$ as required.
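For orientation, it may help to spell out the equivalence of Theorem 1 for a redex whose binder binds exactly two variables. The following instance is only an unfolding of the definitions above for this special case, with $B = X_3/X_4, X_5$; the arrow notation $\to_\beta$ for the β-reduction literal is the one assumed throughout this presentation.

```latex
\begin{align*}
(C, B, A) \to_\beta (C', B', A'_1, A'_2)
  \ \mathrel{=\!\!|\models}\
  \exists X_0, X_1.\ & \mathrm{seg}(A) \wedge \mathrm{seg}(B) \wedge \mathrm{seg}(C) \\
  & \wedge\ X_0{:}@(X_1, X_2) \wedge X_1{:}\mathsf{lam}(X_3)
    \wedge \lambda^{-1}(X_1) = \{X_4, X_5\} \\
  & \wedge\ (C, B, A, A) \sim (C', B', A'_1, A'_2)
\end{align*}
```

The argument segment $A$ occurs twice in the left group, once for each copy of the argument in the reduct.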

6 Solving Group Parallelism Constraints 

We now turn to a sound and complete semi-decision procedure for CLLS with
group parallelism, which thus solves β-reduction constraints. To keep the
presentation readable, we focus on the most relevant rules. The full procedure is
given in [3]. An illustrative example follows in Section 7.

  (D.clash.ineq)      $X{=}Y \wedge X{\neq}Y \to \mathit{false}$
  (D.dom.trans)       $X \lhd^* Y \wedge Y \lhd^* Z \to X \lhd^* Z$
  (D.lab.ineq)        $X{:}f(\ldots) \wedge Y{:}g(\ldots) \to X{\neq}Y$    where $f \neq g$
  (D.lab.dom)         $X{:}f(\ldots, Y, \ldots) \to X \lhd^+ Y$
  (D.distr.notDisj)   $X \lhd^* Z \wedge Y \lhd^* Z \to X \lhd^* Y \vee Y \lhd^* X$
  (D.distr.child)     $X \lhd^* Y \wedge X{:}f(X_1, \ldots, X_m) \to Y{=}X \vee \bigvee_{i=1}^{m} X_i \lhd^* Y$

Fig. 9. Saturation rules for dominance constraints

The procedure is obtained by extending an existing semi-decision procedure
for CLLS that is based on saturation. A constraint is freely identified with
the set of its literals. Starting with a set of literals, more literals are added
according to some saturation rules. Our saturation rules are implications of the
form $\varphi_0 \to \bigvee_{i=1}^{n} \varphi_i$ for some $n \geq 1$. To write down rules more compactly, we will
also use arbitrary positive existential formulas on the left hand side. These can
be eliminated in a preprocessing step: $\exists$-quantified variables can be replaced by
arbitrary variables, and disjunction is eliminated by explosion into several rules.

A saturation rule of the above form is applicable to a constraint $\varphi$ if $\varphi_0$ is
contained in $\varphi$, but none of the $\varphi_i$ is. A rule $\varphi \to \Phi$ is sound if $\varphi \models \Phi$. Apart from
that, we have saturation rules of the form $\varphi_0 \to \exists X \varphi_1$, which introduce fresh
variables. Such a rule is applicable to $\varphi$ if $\varphi_0$ is in $\varphi$, but $\varphi_1$, modulo renaming of
$X$, is not. Given a set $S$ of saturation rules, we call a constraint saturated (under
$S$) if no further rule of $S$ applies to it. We say that a constraint is in $S$-solved
form if it is saturated under $S$ and clash-free (i.e. it does not contain false).
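The saturation scheme just described can be sketched in a few lines of Python. The sketch below is an illustration under several assumptions of our own, not the procedure of [3]: a constraint is a set of tuples such as `("dom", X, Y)` for $X \lhd^* Y$, `("neq", X, Y)` and `("lab", X, f, children)`; equality $X{=}Y$ is read as mutual dominance and $X \lhd^+ Y$ as dominance plus inequality; and only the six rules for dominance constraints are implemented, so the resulting solver is incomplete. The names `rule_instances` and `saturate` are hypothetical.

```python
from itertools import product

FALSE = ("false",)


def rule_instances(phi):
    """For each rule instance whose left-hand side is in phi, yield its alternatives.

    Every alternative is a set of literals; a rule with one alternative is deterministic.
    """
    dom = [(l[1], l[2]) for l in phi if l[0] == "dom"]
    dom_set = set(dom)
    neq = {(l[1], l[2]) for l in phi if l[0] == "neq"}
    lab = [l for l in phi if l[0] == "lab"]
    for x, y in dom:
        # (D.clash.ineq), with X=Y encoded as dominance in both directions
        if (y, x) in dom_set and ((x, y) in neq or (y, x) in neq):
            yield [{FALSE}]
        # (D.dom.trans)
        for y2, z in dom:
            if y2 == y:
                yield [{("dom", x, z)}]
    for (_, x, f, _xs), (_, y, g, _ys) in product(lab, lab):
        if f != g:  # (D.lab.ineq)
            yield [{("neq", x, y)}]
    for _, x, f, xs in lab:
        for child in xs:  # (D.lab.dom): proper dominance as dominance plus inequality
            yield [{("dom", x, child), ("neq", x, child)}]
    for (x, z), (y, z2) in product(dom, dom):
        if z == z2 and x != y:  # (D.distr.notDisj)
            yield [{("dom", x, y)}, {("dom", y, x)}]
    for x, y in dom:
        for _, x2, f, xs in lab:
            if x2 == x and x != y:  # (D.distr.child)
                yield [{("dom", x, y), ("dom", y, x)}] + [{("dom", c, y)} for c in xs]


def saturate(phi):
    """Return all saturated, clash-free constraints reachable from phi."""
    phi = frozenset(phi)
    if FALSE in phi:
        return []
    for alternatives in rule_instances(phi):
        # A rule is applicable if none of its alternatives is already contained in phi.
        if not any(alt <= phi for alt in alternatives):
            return [s for alt in alternatives for s in saturate(phi | alt)]
    return [phi]
```

For example, the constraint $Y_1 \lhd^* Z \wedge X_0 \lhd^* Z$ has two solved forms under these rules, one with $X_0 \lhd^* Y_1$ and one with $Y_1 \lhd^* X_0$, whereas $X{:}f(Y) \wedge Y \lhd^* X$ leads to a clash on every branch. Termination of the sketch relies on the fact that no rule introduces fresh variables, so only finitely many literals can be added.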

Fig. 9 contains an (incomplete) set of saturation rules for dealing with
dominance constraints (the constraints of Sec. 2 without binding). A more complete
collection including a treatment of set operators can be found in the literature. To deal
with parallelism, we first introduce some formulas that describe membership in
(proper) segments and groups. Let $A = X_0/X_1, \ldots, X_n$.

  $X \in b(A) =_{df} X_0 \lhd^* X \wedge \bigwedge_{i=1}^{n} X (\lhd^* \cup \perp) X_i$
  $X \in b^-(A) =_{df} X \in b(A) \wedge \bigwedge_{i=1}^{n} X \neq X_i$
  $X \notin b^-(A) =_{df} X (\lhd^+ \cup \perp) X_0 \vee \bigvee_{i=1}^{n} X_i \lhd^* X$
  $X \in b(A_1, \ldots, A_m) =_{df} \bigvee_{i=1}^{m} X \in b(A_i)$
  $X \in b^-(A_1, \ldots, A_m) =_{df} \bigvee_{i=1}^{m} X \in b^-(A_i)$

Note that the terms $b(A)$, $b^-(A)$, $b(A_1, \ldots, A_m)$ are not given any formal
meaning, even though it would be correct to interpret them as the corresponding
sets of nodes.
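  (P.symm)      $\bar{A} \sim \bar{B} \to \bar{B} \sim \bar{A}$
  (P.init)      $\bar{A} \sim \bar{B} \to \mathrm{seg}(A_i) \wedge \mathrm{co}(A_i, B_i)(X^i_j){=}Y^i_j$    where $1 \leq i \leq n$,
                $A_i = X^i_0/X^i_1, \ldots, X^i_{m_i}$, $B_i = Y^i_0/Y^i_1, \ldots, Y^i_{m_i}$, and $0 \leq j \leq m_i$
  (P.new)       $\bar{A} \sim \bar{B} \wedge U \in b(A_i) \to \exists V\ \mathrm{co}(A_i, B_i)(U){=}V$    where $V$ fresh, $1 \leq i \leq n$
  (P.copy.lab)  $\bigwedge_{i=0}^{m} \mathrm{co}(A, B)(X_i){=}Y_i \wedge X_0{:}f(X_1, \ldots, X_m) \wedge X_0 \in b^-(A) \to Y_0{:}f(Y_1, \ldots, Y_m)$
  (P.copy.dom)  $U_1 R U_2 \wedge \bigwedge_{i=1}^{2} \mathrm{co}(A, B)(U_i){=}V_i \to V_1 R V_2$
  (P.distr.eq)  $\varphi \to X{=}Y \vee X{\neq}Y$    for $X, Y \in V(\varphi)$

Fig. 10. Saturation rules, where $\bar{A} = A_1, \ldots, A_n$ and $\bar{B} = B_1, \ldots, B_n$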



We also want to be able to speak about correspondence functions. So we
extend our constraint language by auxiliary literals

  $\varphi ::= \ldots \mid \mathrm{co}(A, B)(X){=}Y$

where $A$ and $B$ are segment terms for segments with the same number of holes.
Such a literal states that $A$ and $B$ are parallel within some group parallelism,
that $X \in b(A)$ and $Y \in b(B)$, and that $X$ corresponds to $Y$ with respect to the
correspondence function for $A$ and $B$. We introduce two more formulas. Let
$\bar{A} = (A_1, \ldots, A_n)$, $\bar{B} = (B_1, \ldots, B_n)$, and $1 \leq k \leq n$.

  $\mathrm{co}^-(A, B)(X){=}Y =_{df} \mathrm{co}(A, B)(X){=}Y \wedge X \in b^-(A)$
  $\mathrm{co}^-_k(\bar{A}, \bar{B})(X){=}Y =_{df} \bar{A} \sim \bar{B} \wedge \mathrm{co}^-(A_k, B_k)(X){=}Y$

The second lets us talk about correspondence functions for a group
parallelism, picking out the $k$-th correspondence function. In that respect, $\mathrm{co}^-_k(\bar{A}, \bar{B})$
matches the $c_k$ of Def. 5 (except that $\mathrm{co}^-_k(\bar{A}, \bar{B})(X){=}Y$ additionally demands
$X \in b^-(A_k)$ for convenience).

The main rules for handling parallelism are given in Fig. 10. A complete set
can be found in [3]. The rules (P.init) and (P.new) introduce correspondence
literals; between them, they state that each node in a parallel segment needs
to have a correspondent. (P.init) states that in a correspondence function, root
corresponds to root, and hole to hole, while (P.new) is responsible for all other
nodes. (P.copy.dom) and (P.copy.lab) between them ascertain the structural
isomorphism that Def. 3 demands for a correspondence function.

Fig. 11 shows saturation rules for the interaction of group parallelism and
lambda binding. The first four rule schemata directly express the conditions
of Def. 5. The rules (L.distr.1) and (L.distr.2) decide, loosely speaking,
whether variables occurring in a binding literal belong to some segment of a
group or not. This is necessary because we need to know which of the schemata
(L.same.seg), (L.diff.seg), (L.outside) and (L.hang) is applicable. This is
expressed by using the following formula, where $\bar{A} = (A_1, \ldots, A_n)$:

  $\mathrm{distr}_{\bar{A}}(U) =_{df} \bigwedge_{i=1}^{n} (U \in b^-(A_i) \vee U \notin b^-(A_i))$

Finally, (L.inverse) deals with the copying of $\lambda^{-1}$ literals. This is necessary
if we want to perform a second beta reduction step, where we need the
$\lambda^{-1}$ information again. The schema uses two more formulas. The first one is simple:

  $\lambda(X){\neq}Y =_{df} \exists Z\,(\lambda(X){=}Z \wedge Z{\neq}Y)$

The second formula collects, for a finite set of variables, all correspondents
with respect to $\bar{A} \sim \bar{B}$. Let $S_1, S_2$ stand for finite sets of variables, and let
$\bar{A} = A_1, \ldots, A_n$.

  $\mathrm{co}^-(\bar{A}, \bar{B})(S_1){=}S_2 =_{df} \bigwedge_{i=1}^{n} \bigwedge_{X \in S_1} \big(X \notin b^-(A_i) \vee \bigvee_{Y \in S_2} \mathrm{co}^-_i(\bar{A}, \bar{B})(X){=}Y\big)$
  $\qquad \wedge\ \bigwedge_{Y \in S_2} \bigvee_{X \in S_1} \bigvee_{i=1}^{n} \mathrm{co}^-_i(\bar{A}, \bar{B})(X){=}Y$

So (L.inverse) collects all correspondents of all variables bound by $X$; for each
of these correspondents it must be known whether it is bound by $Y$ or definitely
bound by something else. Then we can determine $\lambda^{-1}(Y)$. The soundness of this
rule is not obvious: is it really sufficient to look among the correspondents of
$\lambda^{-1}(X)$ to compute $\lambda^{-1}(Y)$? The following proposition shows that it is.

Proposition 2 (Inverse lambda binding). Suppose $(\alpha_1, \ldots, \alpha_n) \sim (\alpha'_1, \ldots,
\alpha'_n)$ holds with correspondence functions $c_1, \ldots, c_n$. Then for all $1 \leq k \leq n$ and
all $\pi \in b^-(\alpha_k)$,

  $\lambda^{-1}(c_k(\pi)) \subseteq \bigcup_{i=1}^{n} \{c_i(\pi') \mid \pi' \in \lambda^{-1}(\pi) \cap b^-(\alpha_i)\}$

Proof. Let $\omega \in \lambda^{-1}(c_k(\pi))$. The "no hanging binders" condition (hang) of Def.
5 is critical here: it enforces $\omega \in \bigcup_{i=1}^{n} b^-(\alpha'_i)$. If $\omega \in b^-(\alpha'_k)$, then there exists
some $\pi' \in b^-(\alpha_k)$ with $c_k(\pi') = \omega$. $\pi'$ is var-labeled by Def. 3 and has a binder
since $\lambda$ is total. So we must have $\lambda(c_k(\pi')) = c_k(\lambda(\pi'))$ by condition (same.seg)
of Def. 5. Now $\lambda(c_k(\pi')) = c_k(\pi)$ and $c_k$ is a bijection, so $\pi' \in \lambda^{-1}(\pi)$. If, on the
other hand, $\omega \notin b^-(\alpha'_k)$ but $\omega \in b^-(\alpha'_i)$, there is again a $\pi'$ with $c_i(\pi') = \omega$,
and $\lambda(c_i(\pi')) = c_k(\lambda(\pi'))$ by condition (diff.seg), so again $\pi' \in \lambda^{-1}(\pi)$.

The rules we have presented are part of a sound and complete semi-decision
procedure for group parallelism constraints given in [3]. The omitted rules state
additional properties of dominance constraints, ensure that correspondence
functions are indeed bijective functions, and regulate the interaction between
different correspondence functions.
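  (L.same.seg)  $\lambda(U_1){=}U_2 \wedge \bigwedge_{i=1}^{2} \mathrm{co}^-_k(\bar{A}, \bar{B})(U_i){=}V_i \to \lambda(V_1){=}V_2$
  (L.diff.seg)  $\lambda(U_1){=}U_2 \wedge \bigwedge_{i=1}^{2} \mathrm{co}^-_{k_i}(\bar{A}, \bar{B})(U_i){=}V_i \wedge U_2 \notin b^-(A_{k_1}) \to \lambda(V_1){=}V_2$
  (L.outside)   $\lambda(U){=}Y \wedge \mathrm{co}^-_k(\bar{A}, \bar{B})(U){=}V \wedge Y \notin b^-(\bar{A}) \to \lambda(V){=}Y$
  (L.hang)      $\lambda(U_1){=}U_2 \wedge \bar{A} \sim \bar{B} \wedge U_2 \in b^-(\bar{A}) \to U_1 \in b^-(\bar{A})$
  (L.distr.1)   $\lambda(U_1){=}U_2 \wedge \bar{A} \sim \bar{B} \wedge U_1 \in b^-(\bar{A}) \to \mathrm{distr}_{\bar{A}}(U_2)$
  (L.distr.2)   $\lambda(U_1){=}U_2 \wedge \bar{A} \sim \bar{B} \wedge U_2 \in b^-(\bar{A}) \to \mathrm{distr}_{\bar{A}}(U_1)$
  (L.equal)     $\lambda(X_1){=}X_2 \wedge \bigwedge_{i=1}^{2} X_i{=}Y_i \to \lambda(Y_1){=}Y_2$
  (L.inverse)   $\lambda^{-1}(X){=}S_1 \wedge \mathrm{co}^-_k(\bar{A}, \bar{B})(X){=}Y \wedge \mathrm{co}^-(\bar{A}, \bar{B})(S_1){=}S_2 \cup S_3$
                $\wedge \bigwedge_{V \in S_2} \lambda(V){=}Y \wedge \bigwedge_{V \in S_3} \lambda(V){\neq}Y \to \lambda^{-1}(Y){=}S_2$

Fig. 11. Lambda binding rules for group parallelism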



  $(C, B, A, A) \sim (C', B', A', A'')$
  with $C = X/X_0$, $C' = X'/X'_0$, $B = X_b/X_1, X_2$, $B' = X'_b/X'_1, X'_2$,
  $A = X_a/$, $A' = X'_a/$, and $A'' = X''_a/$;
  $\lambda^{-1}(X_l) = \{X_1, X_2\}$, $\lambda^{-1}(Y_1) = \{Z\}$.

Fig. 12. A group parallelism constraint encoding a non-linear beta reduction step



Theorem 2. There exists a saturation procedure GP which encompasses all
instances of the rule schemata in Figs. 9, 10, and 11 such that each rule of GP is
correct, and each GP-solved form of a constraint $\varphi$ is satisfiable (soundness),
and for every solution $(\tau, \sigma)$ of $\varphi$, GP computes a GP-solved form of $\varphi$ of which
$(\tau, \sigma)$ is a solution (completeness).

Proving that GP-solved forms are satisfiable can be done by constructing a
model and variable assignment explicitly. One then has to check that all literals
are indeed satisfied, which requires a tedious case distinction. Proving
completeness is nontrivial as well, but can be done along the lines of earlier work. It is largely
independent of the particularities of the rule system we employ.

7 The Procedure in Action 

We illustrate the procedure of the previous section by solving the constraint in
Fig. 12. It contains a non-linear lambda redex at $(C, B, A)$ (similar to the binary
redex considered earlier) and a lambda binder at $Y_1$ which can belong either to the
context $C$ or to the argument $A$. The group parallelism constraint
$(C, B, A, A) \sim (C', B', A', A'')$ describes a beta-reduction step for the redex $(C, B, A)$.

A record of the solving steps is given in Fig. 13 and 14. We only comment on
the main steps. In step (4), we have $Y_1 \lhd^* Z \wedge X_0 \lhd^* Z$, and as trees do not branch
upwards, one of $Y_1, X_0$ must dominate the other. This step effectively guesses
whether $Y_1, Y_2$ are in $C$ or in $A$. With choice (5c), we make two copies of $Y_1$ and
$Y_2$ each. This is because $A$ is parallel both to $A'$ and $A''$: because the binder of the
redex binds two variables, the argument is copied twice. On the other hand, with (7b) $Y_1$ and $Y_2$
are only copied once: they belong to the context $C$, which is parallel only to $C'$.

  (1)  $Y_1 \neq X_0$                                   (D.lab.ineq)
  (2)  $Y_1 \lhd^+ Y_2$, $X_0 \lhd^+ X_a$                (D.lab.dom)
  (3)  $Y_1 \lhd^* Z$, $X_0 \lhd^* Z$                    (D.dom.trans)
  (4)  $X_0 \lhd^* Y_1 \vee Y_1 \lhd^* X_0$              (D.distr.notDisj)

  (4a) $X_0 \lhd^* Y_1$:
    (5)  $X_0 = Y_1 \vee X_l \lhd^* Y_1 \vee X_a \lhd^* Y_1$   (D.distr.child)
    (5a) $X_0 = Y_1$ and (5b) $X_l \lhd^* Y_1$: both lead to false
    (5c) $X_a \lhd^* Y_1$:
      (9)  $\mathrm{co}(A, A')(X_a){=}X'_a$                                            (P.init)
      (10) $\mathrm{co}(A, A')(Y_1){=}Y'_1$, $\mathrm{co}(A, A')(Y_2){=}Y'_2$, $\mathrm{co}(A, A')(Z){=}Z'$      (P.new)
      (11) $X'_a \lhd^* Y'_1$, $Y'_2 \lhd^* Z'$                                      (P.copy.dom)
      (12) $Y'_1{:}\mathsf{lam}(Y'_2)$                                              (P.copy.lab)
      (13) $\mathrm{co}(A, A'')(X_a){=}X''_a$                                          (P.init)
      (14) $\mathrm{co}(A, A'')(Y_1){=}Y''_1$, $\mathrm{co}(A, A'')(Y_2){=}Y''_2$, $\mathrm{co}(A, A'')(Z){=}Z''$  (P.new)
      (15) $X''_a \lhd^* Y''_1$, $Y''_2 \lhd^* Z''$                                  (P.copy.dom)
      (16) $Y''_1{:}\mathsf{lam}(Y''_2)$                                            (P.copy.lab)

  (4b) $Y_1 \lhd^* X_0$:
    (7)  $Y_1 = X_0 \vee Y_2 \lhd^* X_0$                 (D.distr.child)
    (7a) $Y_1 = X_0$:
      (8)  false                                        (D.clash.ineq)
    (7b) $Y_2 \lhd^* X_0$:
      (17) $\mathrm{co}(C, C')(X){=}X'$, $\mathrm{co}(C, C')(X_0){=}X'_0$       (P.init)
      (18) $\mathrm{co}(C, C')(Y_1){=}Y'_1$, $\mathrm{co}(C, C')(Y_2){=}Y'_2$   (P.new)
      (19) $X' \lhd^* Y'_1$, $Y'_2 \lhd^* X'_0$                (P.copy.dom)
      (20) $Y'_1{:}\mathsf{lam}(Y'_2)$                         (P.copy.lab)

Fig. 13. Solving the group parallelism constraint in Fig. 12

  Continuing (5c)
  (21) $\lambda(Z') = Y'_1$, $\lambda(Z'') = Y''_1$              (L.same.seg)
  (22) $Z \notin b^-(B) \vee Z \in b^-(B)$                      (L.distr.2)
  (22a) $Z \notin b^-(B)$:
    (23) $Z \notin b^-(C) \vee Z \in b^-(C)$                    (L.distr.2)
    (23a) $Z \notin b^-(C)$:
      (24) $Y'_1 \neq Y''_1 \vee Y'_1 = Y''_1$                  (P.distr.eq)
      (24a) $Y'_1 \neq Y''_1$:
        (25) $\lambda(Z') \neq Y''_1$, $\lambda(Z'') \neq Y'_1$
        (26) $\lambda^{-1}(Y'_1) = \{Z'\}$                      (L.inverse)
        (27) $\lambda^{-1}(Y''_1) = \{Z''\}$                    (L.inverse)
      (24b) $Y'_1 = Y''_1$: false
    (23b) $Z \in b^-(C)$: false
  (22b) $Z \in b^-(B)$: false

Fig. 14. Inverse Binding in case (5c)

In Fig. 14 we continue case (5c) of Fig. 13, applying the lambda binding rules.
All steps from (22) on prepare the determination of $\lambda^{-1}(Y'_1)$ and $\lambda^{-1}(Y''_1)$ in (26)
and (27). We know $\lambda^{-1}(Y_1) = \{Z\}$. Steps (22) and (23) determine $S_2 \cup S_3$ to be
$\{Z', Z''\}$ for both (26) and (27). After (24) we know $\lambda(Z') \neq Y''_1$ and $\lambda(Z'') \neq Y'_1$,
so we have all we need to infer the correct $\lambda^{-1}$ information in the last steps.
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8 Conclusion and Future Work 

We have introduced beta reduction constraints and have presented a semi- 
decision procedure for processing them, in three steps: First, we have extended 
CLLS by group parallelism constraints. Second, we have expressed /3-reduction 
constraints in this extension of CLLS. Third, we have lifted a known semi- 
decision procedure for CLLS to also deal with group parallelism constraints. 
It is an open question to what extent beta reduction constraints can conversely 
express parallelism. 

This gives us a framework for investigating the more general problem of 
underspecified beta reduction Pj: How can we string together several reduction 
steps, as described by beta reduction constraints, until we arrive at descriptions 
of normal forms? In this broader setting, we can investigate properties such 
as confluence and termination on the underspecified level. Another problem, 
motivated by the application, is to modify the saturation procedure to perform 
as few case distinctions as possible during underspecified beta reduction. Finally, 
it will be interesting to find an efficient implementation of this operation, possibly 
employing concepts such as sharing and constraint programming.

Abstract. We show how higher-order rewriting may be encoded into
first-order rewriting modulo an equational theory $\mathcal{E}$. We obtain a
characterization of the class of higher-order rewriting systems which can be
encoded by first-order rewriting modulo an empty theory (that is,
$\mathcal{E} = \emptyset$). This class includes of course the $\lambda$-calculus.
Our technique does not rely on a particular substitution calculus but on a set
of abstract properties to be verified by the substitution calculus used in the
translation.

1 Introduction 

Higher-order substitution is a complex operation that consists in the replacement
of variables by terms in the context of languages having variable bindings. These
bound variables can be annotated by de Bruijn indices so that the renaming
operation ($\alpha$-conversion) which is necessary to carry out higher-order
substitution can be avoided. However, substitution is still a complicated notion,
which cannot be expressed by simple replacement (a.k.a. grafting) of variables as
is done in first-order theories. To solve this problem, many researchers became
interested in the formalization of higher-order substitution by explicit
substitutions, so that higher-order systems/formalisms could be expressed in
first-order systems/formalisms: the notion of variable binding is dropped because
substitution becomes replacement. A well-known example of the combination of de
Bruijn indices and explicit substitutions is the formulation of different
first-order calculi for the $\lambda$-calculus [ 11411 711 , 510 IZ'ZIJ, which is
the paradigmatic example of a higher-order (term) rewriting system. Other examples
are the translations of higher-order unification to first-order unification
modulo, higher-order logic to first-order logic modulo m, higher-order theorem
proving to first-order theorem proving modulo [HI, etc.

All these translations have a theoretical interest because the expressive power 
of higher and first-order formalisms is put in evidence, but another practical issue 
arises, that of the possibility of transferring results developed in the first-order 
framework to the higher-order one. 

The goal of this paper is to give a translation of higher-order rewrite systems
(HORS) to a first-order formalism. As a consequence, properties and techniques
developed for first-order rewriting could be exported to the higher-order
formalism, for example techniques concerning confluence, termination, completion,
evaluation strategies, etc. This is very interesting since, on the one hand, it is
still not clear how to transfer techniques such as dependency pairs [2], semantic
labelling m or completion P| to the higher-order framework, and on the other hand,
termination techniques such as RPO for higher-order systems m turn out to be much
more complicated than their respective first-order versions izcg.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 47-62, 2001.
(c) Springer-Verlag Berlin Heidelberg 2001

The main difficulty encountered in our translation can be intuitively explained
by the fact that in higher-order rewriting a metavariable occurring on the
right-hand side of a higher-order rewrite rule may occur in a different binding
context on the left-hand side: for example, in the usual presentation of the
extensional rule for functional types in the $\lambda$-calculus,
$(\eta)$: $\lambda\alpha.app(X,\alpha) \to X$, the occurrence of $X$ on the
right-hand side of the rule, which does not appear in any binding context, is
related to the occurrence of $X$ on the left-hand side, which appears inside a
binding context. The immediate consequence of this fact is that the first-order
translation of the rule $(\eta)$ cannot be defined as the naive translation taking
both sides of the rule independently. This would give the first-order rule
$\lambda(app(X,1)) \to X$, which does not reflect the intended semantics, and hence
the translation would be incorrect.

As mentioned before, the need for $\alpha$-conversion immediately disappears
when de Bruijn notation is considered. Following the example just introduced, one
can express the $(\eta)$-rule in a higher-order de Bruijn setting, such as for
example the SERSdb (Simplified Expression Reduction Systems) formalism 0, by the
rule $(\eta_{dB})$: $\lambda(app(X_\alpha,1)) \to X_\epsilon$. The notation used to
translate the metavariable $X$ into the de Bruijn formalism enforces the fact that
the occurrence of $X$ on the right-hand side of the rule $(\eta)$ does not appear in
a binding context, so it is translated as $X_\epsilon$, where $\epsilon$ represents
an empty binding path, while the occurrence of $X$ on the left-hand side of the rule
appears inside a binding context, so it is translated as $X_\alpha$, where $\alpha$
represents a binding path of length 1 surrounding the metavariable $X$.

Now, the term $\lambda(app(3,1))$ reduces to $2$ via the $(\eta_{dB})$ rule. In an
explicit substitution setting, we have the alternative formulation:

$(\eta_{fo})$: $\lambda(app(X[\uparrow],1)) \to X$

However, in order for the term $X[\uparrow]$ to match the subterm $3$, first-order
matching no longer suffices: we need $\mathcal{E}$-matching, that is, matching
modulo an equational theory $\mathcal{E}$. For an appropriate substitution calculus
$\mathcal{E}$ we would need to solve the equation $3 =_{\mathcal{E}} X[\uparrow]$.
Equivalently, we could make use of the theory of conditional rewriting:
$\lambda(app(Y,1)) \to X$ where $Y =_{\mathcal{E}} X[\uparrow]$. Another less evident
example is given by a commutation rule such as

$(C)$: $Implies(\exists\alpha.\forall\beta.X, \forall\beta.\exists\alpha.X) \to true$

which expresses that the formula appearing as the first argument of the $Implies$
function symbol implies the one in the second argument. The naive translation to
first-order, $Implies(\exists(\forall(X)), \forall(\exists(X))) \to true$, is
evidently not correct, so we take its translation in the de Bruijn higher-order
formalism SERSdb and then translate it to first-order via the conversion presented
in this work, obtaining:

$(C_{fo})$: $Implies(\exists(\forall(X)), \forall(\exists(X[2 \cdot 1 \cdot id]))) \to true$

Now, the rule $(C_{fo})$ has exactly the same semantics as the original higher-order
rule $(C)$. This difficulty does not seem to appear in the other translations from
higher-order to first-order mentioned above.

In this work we shall see how higher-order rewriting may be systematically
reduced to first-order rewriting modulo an equational theory $\mathcal{E}$. To do
this, we choose to work with Expression Reduction Systems m, and in particular, with
de Bruijn based SERSdb as defined in jS], which facilitates the translation of
higher-order systems to first-order ones. However, we claim that the same
translation could be applied to other higher-order rewriting formalisms existing in
the literature. We obtain a characterization of the class of SERSdb (including the
$\lambda$-calculus) for which a translation to a full ($\mathcal{E} = \emptyset$)
first-order rewrite system exists. Thus the result mentioned above on the
$\lambda$-calculus becomes a particular case of our work.

To the best of our knowledge there is just one formalism, called XRS m,
which studies higher-order rewrite formalisms based on de Bruijn index notation
and explicit substitutions. The formalism XRS, which is a first-order formalism,
is presented as a generalization of the first-order $\sigma_{\Uparrow}$-calculus to
higher-order rewriting and not as a first-order formulation of higher-order
rewriting. As a consequence, many well-known higher-order rewriting systems cannot
be expressed in such a formalism jSj. Not only do we provide a first-order
presentation of higher-order rewriting, but we also do not attach to the
translation any particular substitution calculus. Instead, we have chosen to work
with an abstract formulation of substitution calculi, as done for example in im to
deal with confluence proofs of $\lambda$-calculi with explicit substitutions. As a
consequence, the method we propose can be put to work in the presence of different
calculi of explicit
substitution such as cr a cni, ■u 0, / Id, d Id. s Id. X nil- 

The paper is organized as follows. Section 2 recalls the formalism of
higher-order rewriting with de Bruijn indices defined in pj, and Section 3 defines a
first-order syntax which will be used as target calculus in the conversion
procedure given in Section 4. Properties of the conversion procedure are studied in
Section 5: the conversion is a translation from higher-order rewriting to
first-order rewriting modulo, the translation is conservative, and finally we give
the syntactical criterion to be used in order to decide if a given higher-order
system can be translated into a full first-order one (a first-order system modulo
an empty theory). We conclude in Section 6.

Due to lack of space proofs are omitted and only the main results are stated.
For further details the reader is referred to the full version accessible by ftp at
ftp://ftp.lri.fr/LRI/articles/kesner/ho-to-fo.ps.gz.

2 The Higher-Order Framework

We briefly recall here the de Bruijn indices based higher-order rewrite
formalism called Simplified Expression Reduction Systems with de Bruijn Indices
(SERSdb) which was introduced in |^. In full precision we shall work with
SERSdb without i(ndex)-metavariables (cf. full version for details).

Definition 1 (Labels). A label is a finite sequence of symbols of an alphabet.
We shall use $k, l, l_i, \ldots$ to denote arbitrary labels and $\epsilon$ for the
empty label. If $\alpha$ is a symbol and $l$ is a label then $\alpha \in l$ means
that the symbol $\alpha$ appears in the label $l$. Other notations are $|l|$ for the
length of $l$ and $at(l,n)$ for the $n$-th element of $l$ assuming $n \le |l|$. Also,
if $\alpha$ occurs (at least once) in $l$ then $pos(\alpha,l)$ denotes the position
of the first occurrence of $\alpha$ in $l$. A simple label is a label without
repeated symbols.



Definition 2 (de Bruijn signature). Consider the denumerable and disjoint
infinite sets:

- $\{\alpha_1, \alpha_2, \alpha_3, \ldots\}$ a set of symbols called binder
  indicators, denoted $\alpha, \beta, \ldots$,

- $\{X_l^1, X_l^2, X_l^3, \ldots\}$ a set of t-metavariables (t for term), where $l$
  ranges over the set of labels built over binder indicators, denoted
  $X_l, Y_l, Z_l, \ldots$,

- $\{f_1, f_2, f_3, \ldots\}$ a set of function symbols equipped with a fixed
  (possibly zero) arity, denoted $f, g, h, \ldots$,

- $\{\lambda_1, \lambda_2, \lambda_3, \ldots\}$ a set of binder symbols equipped
  with a fixed (non-zero) arity, denoted $\lambda, \mu, \nu, \xi, \ldots$

Definition 3 (de Bruijn pre-metaterms). The set of de Bruijn pre-metaterms,
denoted $\mathcal{PMT}_{db}$, is defined by the following two-sorted grammar:

metaindices    $I ::= 1 \mid S(I)$

pre-metaterms  $A ::= I \mid X_l \mid f(A,\ldots,A) \mid \xi(A,\ldots,A) \mid A[\![A]\!]$

We shall use $A, B, A_i, \ldots$ to denote de Bruijn pre-metaterms. The symbol
$.[\![.]\!]$ is called the de Bruijn metasubstitution operator. We assume the
convention that $S^0(1) = 1$ and $S^{j+1}(n) = S(S^j(n))$. As usually done for
indices, we shall abbreviate $S^{j-1}(1)$ as $j$.

We use $MVar(A)$ to denote the set of metavariables of the de Bruijn
pre-metaterm $A$. An $X$-based t-metavariable is a t-metavariable of the form $X_l$
for some label $l$; we say in that case that $X$ is the name of $X_l$. We shall use
$NMVar(A)$ to denote the set of names of metavariables in $A$. In order to say
that a t-metavariable $X_l$ occurs in a pre-metaterm $A$ we write $X_l \in A$.

We shall need the notion of metaterm (well-formed pre-metaterm). The first
motivation is to guarantee that labels of t-metavariables are correct w.r.t. the
context in which they appear; the second one is to ensure that indices like
$S^i(1)$ correspond to bound variables. Indeed, pre-metaterms like
$\xi(X_{\alpha\beta})$ and $\xi(\xi(4))$ shall not make sense for us, and hence shall
not be considered well-formed. Well-formed pre-metaterms shall be used to describe
rewrite rules.

Definition 4 (de Bruijn metaterms). A pre-metaterm $A \in \mathcal{PMT}_{db}$ is
said to be a metaterm iff the predicate $\mathcal{WF}(A)$ holds, where
$\mathcal{WF}(A) =_{def} \mathcal{WF}_\epsilon(A)$ and $\mathcal{WF}_l(A)$ is defined
as follows:

- $\mathcal{WF}_l(S^j(1))$ iff $j + 1 \le |l|$

- $\mathcal{WF}_l(X_k)$ iff $l = k$ and $l$ is a simple label

- $\mathcal{WF}_l(f(A_1,\ldots,A_n))$ iff for all $1 \le i \le n$ we have
  $\mathcal{WF}_l(A_i)$

- $\mathcal{WF}_l(\xi(A_1,\ldots,A_n))$ iff there exists $\alpha \notin l$ s.t. for
  all $1 \le i \le n$, $\mathcal{WF}_{\alpha l}(A_i)$

- $\mathcal{WF}_l(A_1[\![A_2]\!])$ iff $\mathcal{WF}_l(A_2)$ and there exists
  $\alpha \notin l$ such that $\mathcal{WF}_{\alpha l}(A_1)$

Example 1. Pre-metaterms $\xi(X_\alpha, \lambda(Y_{\beta\alpha}, 2))$ and
$g(\lambda(\xi(h)))$ are metaterms, while $f(1, \xi(X_\beta))$,
$\lambda(\xi(X_{\alpha\alpha}))$ and $\xi(X_\alpha, X_\beta)$ (with
$\alpha \neq \beta$) are not.



Definition 5 (de Bruijn terms and de Bruijn contexts). The set of de Bruijn
terms, denoted $\mathcal{T}_{db}$, and the set of de Bruijn contexts are defined by:

de Bruijn indices   $n ::= 1 \mid S(n)$

de Bruijn terms     $a ::= n \mid f(a,\ldots,a) \mid \xi(a,\ldots,a)$

de Bruijn contexts  $E ::= \Box \mid f(a,\ldots,E,\ldots,a) \mid \xi(a,\ldots,E,\ldots,a)$

We use $a, b, a_i, b_i, \ldots$ for de Bruijn terms and $E, F, \ldots$ for de Bruijn
contexts. The binder path number of a context is the number of binders between the
$\Box$ and the root. For example, the binder path number of
$E = f(3, \xi(1, \lambda(2, \Box, 3)), 2)$ is 2.

Remark that de Bruijn terms are also de Bruijn pre-metaterms, that is,
$\mathcal{T}_{db} \subset \mathcal{PMT}_{db}$, although note that some de Bruijn
terms may not be de Bruijn metaterms, i.e. may not be well-formed de Bruijn
pre-metaterms, e.g. $\xi(\xi(4))$. The result of substituting a term $b$ for the
index $n \ge 1$ in a term $a$ is denoted $a\{\!\{n \leftarrow b\}\!\}$. This is
defined as usual [E]. We now recall the definition of rewrite rules, valuations,
their validity, and reduction in SERSdb.

Definition 6 (de Bruijn rewrite rule). A de Bruijn rewrite rule is a pair of
de Bruijn metaterms $(L,R)$ (also written $L \to R$) such that the first symbol in
$L$ is a function symbol or a binder symbol, $NMVar(R) \subseteq NMVar(L)$, and the
metasubstitution operator $.[\![.]\!]$ does not occur in $L$.



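Definition 7 (de Bruijn valuation). A de Bruijn valuation $\kappa$ is a (partial)
function from t-metavariables to de Bruijn terms. A valuation $\kappa$ determines in
a unique way a function $\overline{\kappa}$ (also called valuation) on
pre-metaterms as follows:

    $\overline{\kappa}\, n =_{def} n$
    $\overline{\kappa}\, X_l =_{def} \kappa X_l$
    $\overline{\kappa}\, f(A_1,\ldots,A_n) =_{def} f(\overline{\kappa} A_1,\ldots,\overline{\kappa} A_n)$
    $\overline{\kappa}\, \xi(A_1,\ldots,A_n) =_{def} \xi(\overline{\kappa} A_1,\ldots,\overline{\kappa} A_n)$
    $\overline{\kappa}\, (A_1[\![A_2]\!]) =_{def} \overline{\kappa}(A_1)\{\!\{1 \leftarrow \overline{\kappa} A_2\}\!\}$

We write $Dom(\kappa)$ for the set $\{X_l \mid \kappa X_l \text{ is defined}\}$,
called the domain of $\kappa$.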
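
Definition 7 can be read as a recursive traversal. The following Python sketch illustrates it under assumptions that are not in the text: terms are encoded as nested tuples, labels are plain strings, and the substitution $a\{\!\{n \leftarrow b\}\!\}$, which the paper only describes as defined as usual, is taken to be the standard de Bruijn substitution with index updating. The names `update`, `subst` and `apply_valuation` are hypothetical.

```python
# Illustrative encoding (an assumption of this sketch, not the paper's syntax):
#   int                     -> de Bruijn index (1, 2, 3, ...)
#   ("fun", name, [args])   -> function symbol applied to arguments
#   ("bind", name, [args])  -> binder symbol applied to arguments
#   ("mvar", name, label)   -> t-metavariable X_l, label given as a string
#   ("msub", A1, A2)        -> metasubstitution A1[[A2]]

def update(term, n, depth=0):
    """Renumber free indices of `term` so it can be placed under n-1 extra binders."""
    if isinstance(term, int):
        return term + n - 1 if term > depth else term
    tag, name, args = term
    inner = depth + 1 if tag == "bind" else depth
    return (tag, name, [update(a, n, inner) for a in args])

def subst(term, n, b):
    """Assumed standard de Bruijn substitution a{{n <- b}} on terms."""
    if isinstance(term, int):
        if term > n:
            return term - 1          # one binder disappears
        if term == n:
            return update(b, n)      # the substituted index
        return term                  # index bound closer to the leaf
    tag, name, args = term
    inner = n + 1 if tag == "bind" else n
    return (tag, name, [subst(a, inner, b) for a in args])

def apply_valuation(kappa, meta):
    """Extend kappa (a dict keyed by (name, label)) to pre-metaterms."""
    if isinstance(meta, int):
        return meta
    if meta[0] == "mvar":
        return kappa[(meta[1], meta[2])]
    if meta[0] == "msub":
        body = apply_valuation(kappa, meta[1])
        arg = apply_valuation(kappa, meta[2])
        return subst(body, 1, arg)
    tag, name, args = meta
    return (tag, name, [apply_valuation(kappa, a) for a in args])
```

For instance, with the valuation sending $X_\alpha$ to $app(1,2)$ and $Z_\epsilon$ to $3$, the sketch maps the pre-metaterm $X_\alpha[\![Z_\epsilon]\!]$ to $app(3,1)$.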

We now introduce the notion of value function which is used to give semantics
to metavariables with labels in the SERSdb formalism. The goal pursued by
the labels of metavariables is that of incorporating "context" information as
a defining part of a metavariable. A typical example is given by a rule like
$C$: $\xi(\xi(X_{\beta\alpha})) \to \xi(\xi(X_{\alpha\beta}))$ where the
$X$-occurrence on the RHS of the rule denotes a permutation of the binding context
of the $X$-occurrence on the LHS.

As a consequence, we must verify that the terms substituted for every
occurrence of a fixed metavariable coincide "modulo" their corresponding context.
Dealing with such a notion of "coherence" of substitutions in a de Bruijn
formalism is also present in other formalisms but in a more restricted form. Thus
for example a pre-cooking function^1 is used in P] to avoid variable capture in the
higher-order unification procedure. In XRS m the notions of binding arity and
pseudo-binding arity are introduced to take into account the binder path number
of rewrite rules. Our notion of "coherence" is implemented with valid valuations
(cf. Definition 9) and it turns out to be more general than the solutions proposed
in PI and [25 •

Definition 8 (Value function). Let $a$ be a de Bruijn term and $l$ be a label of
binder indicators. We define the value function $Value(l,a)$ as $Value^0(l,a)$ where

$Value^i(l, n) =_{def} \begin{cases} n & \text{if } n \le i \\ at(l, n-i) & \text{if } 0 < n-i \le |l| \\ x_{n-i-|l|} & \text{if } n-i > |l| \end{cases}$

$Value^i(l, f(a_1,\ldots,a_n)) =_{def} f(Value^i(l,a_1),\ldots,Value^i(l,a_n))$

$Value^i(l, \xi(a_1,\ldots,a_n)) =_{def} \xi(Value^{i+1}(l,a_1),\ldots,Value^{i+1}(l,a_n))$

It is worth noting that $Value^i(l,n)$ may give three different kinds of
results. This is just a technical resource. Indeed,
$Value(\alpha\beta, \xi(f(3,1))) = \xi(f(\beta,1)) = Value(\beta\alpha, \xi(f(2,1)))$
and
$Value(\epsilon, f(\xi(1), \lambda(2))) = f(\xi(1), \lambda(x_1)) \neq f(\xi(1), \lambda(\alpha)) = Value(\alpha, f(\xi(1), \lambda(2)))$.
Thus the function $Value(l,a)$ interprets the de Bruijn term $a$ in an $l$-context:
bound indices are left untouched, free indices referring to the $l$-context are
replaced by the corresponding binder indicator, and the remaining free indices are
replaced by adequate variable names.
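
The value function can be sketched in Python as follows. This is an illustration only: the tuple encoding of terms, the use of strings for binder indicators, and the rendering of the variable name $x_m$ as the string `"x<m>"` are assumptions of the sketch, and the name `value` is hypothetical. The third branch follows the reading of Definition 8 suggested by the worked examples above.

```python
# Illustrative encoding (an assumption of this sketch):
#   int                     -> de Bruijn index
#   ("fun", name, [args])   -> function symbol applied to arguments
#   ("bind", name, [args])  -> binder symbol applied to arguments
# A label is a list of binder indicators; position 1 is the innermost binder.

def value(label, term, depth=0):
    """Value^depth(label, term): interpret `term` in the context given by `label`."""
    if isinstance(term, int):
        free = term - depth
        if free <= 0:
            return term                        # bound inside the term: untouched
        if free <= len(label):
            return label[free - 1]             # refers to the label context
        return f"x{free - len(label)}"         # remaining free index: variable name
    tag, name, args = term
    inner = depth + 1 if tag == "bind" else depth
    return (tag, name, [value(label, a, inner) for a in args])
```

Two terms read in two different label contexts denote the same object exactly when their values agree, which is the comparison that valid valuations rely on.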

Definition 9 (Valid de Bruijn valuation). A de Bruijn valuation $\kappa$ is said
to be valid if for every pair of t-metavariables $X_l$ and $X_{l'}$ in $Dom(\kappa)$
we have $Value(l, \kappa X_l) = Value(l', \kappa X_{l'})$. Likewise, we say that a
de Bruijn valuation $\kappa$ is valid for a rewrite rule $(L,R)$ if for every pair
of t-metavariables $X_l$ and $X_{l'}$ in $(L,R)$ we have
$Value(l, \kappa X_l) = Value(l', \kappa X_{l'})$.

Example 2. Let us consider the de Bruijn rule
$C$: $\xi(\xi(X_{\beta\alpha})) \to \xi(\xi(X_{\alpha\beta}))$. We have that
$\kappa = \{X_{\beta\alpha}/2, X_{\alpha\beta}/1\}$ is valid since
$Value(\beta\alpha, 2) = \alpha = Value(\alpha\beta, 1)$.

^1 The pre-cooking function translates a de Bruijn $\lambda$-term with
metavariables into a $\lambda\sigma$-term by suffixing each metavariable $X$ with as
many explicit shift operators as the binder path number of the context obtained by
replacing $X$ by $\Box$. This avoids variable capture when the higher-order
unification procedure finds solutions for the t-metavariables.

As already mentioned, the $\eta$-contraction rule $\lambda x.app(X,x) \to X$ can be
expressed in the SERSdb formalism as the rule
$(\eta_{dB})$: $\lambda(app(X_\alpha,1)) \to X_\epsilon$.

Our formalism, like other HORS in the literature, allows us to use rules like
$\eta_{dB}$ because valid valuations will test for coherence of values.

Definition 10 (SERSdb-rewriting). Let $\mathcal{R}$ be a set of de Bruijn rewrite
rules and $a, b$ be de Bruijn terms. We say that $a$ $\mathcal{R}$-reduces or
rewrites to $b$, written $a \to_{\mathcal{R}} b$, iff there is a de Bruijn rule
$(L,R) \in \mathcal{R}$, a de Bruijn valuation $\kappa$ valid for $(L,R)$, and a de
Bruijn context $E$ such that $a = E[\overline{\kappa} L]$ and
$b = E[\overline{\kappa} R]$.

Thus, the term $\lambda(app(\lambda(app(1,3)),1))$ rewrites by the $\eta_{dB}$ rule
to $\lambda(app(1,2))$, using the (valid) valuation
$\kappa = \{X_\alpha/\lambda(app(1,3)), X_\epsilon/\lambda(app(1,2))\}$.
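
The validity of this valuation can be checked by hand against Definitions 8 and 9. The computation below is a sketch which assumes that the third clause of the value function names a remaining free index $n$ at depth $i$ as $x_{n-i-|l|}$, as the worked examples following Definition 8 suggest.

```latex
\begin{align*}
Value(\alpha, \lambda(app(1,3)))
  &= \lambda(app(Value^1(\alpha,1),\, Value^1(\alpha,3)))
   = \lambda(app(1, x_1)), \\
Value(\epsilon, \lambda(app(1,2)))
  &= \lambda(app(Value^1(\epsilon,1),\, Value^1(\epsilon,2)))
   = \lambda(app(1, x_1)).
\end{align*}
```

In the first line, $3 - 1 = 2 > |\alpha| = 1$, so the index 3 becomes $x_{2-1} = x_1$; in the second, $2 - 1 = 1 > |\epsilon| = 0$, so the index 2 becomes $x_1$ as well. Both values coincide, which is what Definition 9 requires of the pair $X_\alpha$, $X_\epsilon$.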

3 The First-Order Framework 

In this section we introduce the first-order formalism called Explicit Expression
Reduction Systems (ExERS) used to translate higher-order rewriting systems
based on de Bruijn indices into first-order ones.

Definition 11. A substitution declaration is a (possibly empty) word over the
alphabet $\{T, S\}$. The symbol $T$ is used to denote terms and $S$ to denote
substitutions. A substitution signature is a set $\Gamma_s$ of substitution symbols
equipped with an arity $n$ and a substitution declaration of length $n$. We use
$\sigma : (w)$ where $w \in \{T, S\}^n$ if the substitution symbol $\sigma$ has arity
$n$ and substitution declaration $w$. We use $\epsilon$ to denote the empty word.



Definition 12 (ExERS term algebra). An ExERS signature is a set
$\Gamma = \Gamma_f \cup \Gamma_b \cup \Gamma_s$ where
$\Gamma_f = \{f_1,\ldots,f_n\}$ is a set of function symbols,
$\Gamma_b = \{\lambda_1,\ldots,\lambda_n\}$ is a set of binder symbols, $\Gamma_s$ is a
substitution signature, and $\Gamma_f$, $\Gamma_b$ and $\Gamma_s$ are pairwise
disjoint. Both binder and function symbols come equipped with an arity. Given a set
of (term) variables $\mathcal{V} = \{X_1, X_2, \ldots\}$, the term algebra of an
ExERS of signature $\Gamma$ generated by $\mathcal{V}$ is denoted by $\mathcal{T}$
and contains all the objects (denoted by letters $o$ and $p$) generated by the
following grammar:

indices            $n ::= 1 \mid S(n)$

terms (T)          $a ::= X \mid n \mid a[s] \mid f(a_1,\ldots,a_n) \mid \xi(a_1,\ldots,a_n)$

substitutions (S)  $s ::= \sigma(o_1,\ldots,o_n)$

where $X$ ranges over $\mathcal{V}$, $f$ over $\Gamma_f$, $\xi$ over $\Gamma_b$, and
$\sigma$ over $\Gamma_s$. The arguments of $\sigma$ are assumed to respect the sorts
prescribed in its substitution declaration, and function and binder symbols are
assumed to respect their arities.

Letters $a, b, c, \ldots$ and $s, s_i, \ldots$ are used for terms and
substitutions, respectively. The $.[.]$ operator is called the substitution
operator. Binder symbols and substitution operators are considered as having
binding power. We shall use $a[s]^n$ to abbreviate $a[s]\ldots[s]$ ($n$ times).
Terms without occurrences of the substitution operator (resp. objects in
$\mathcal{V}$) are called pure (resp. ground) terms. A context is a ground term with
one (and only one) occurrence of a distinguished term variable called a "hole" (and
denoted $\Box$). Letters $E, E_i, \ldots$ are used for contexts. The notion of
binder path number is defined for pure contexts exactly as in the case of de Bruijn
contexts.

The formalism of ExERS that we are going to use in order to encode higher- 
order rewriting consists of two sets of rewrite rules: 

1. A set of proper rewrite rules governing the behaviour of the function and 
binder symbols in the signature. 

2. A set of substitution rules, called the substitution calculus governing the 
behaviour of the substitution symbols in the signature, and used for propa- 
gating and performing/eliminating term substitutions. 

Let us define these two concepts formally. 

Definition 13 (Substitution macros). Let $\Gamma_s$ be a substitution signature.
The following symbols not included in $\Gamma_s$ are called substitution macros:
$cons : (TS)$, $lift : (S)$, $id : (\epsilon)$ and $shift^j : (\epsilon)$ for
$j \ge 1$. We shall abbreviate $shift^1$ by $shift$. Also, if $j \ge 0$ then
$lift^j(s)$ stands for $s$ if $j = 0$ and for $lift(lift^{j-1}(s))$ otherwise.
Furthermore, $cons(a_1,\ldots,a_i,s)$ stands for $cons(a_1,\ldots cons(a_i,s))$.

Definition 14 (Term rewrite and equational systems). Let $\Gamma$ be an ExERS
signature. An equation is a pair of terms $L = R$ over $\Gamma$ such that $L$ and
$R$ have the same sort, and a term rewrite rule is a pair of terms $(L,R)$ over
$\Gamma$ such that (1) $L$ and $R$ have the same sort, (2) the head symbol of the
LHS of the rule is a function or a binder symbol, and (3) the set of variables of
the LHS includes those of the RHS. An equational (resp. term rewrite) system is a
set of equations (resp. term rewrite rules).

Definition 15 (Substitution calculus). A substitution calculus over an ExERS
signature $\Gamma$ consists of a set $\mathcal{W}$ of rewrite rules, and an
interpretation of each substitution macro as some combination of substitution
symbols from $\Gamma_s$ of corresponding signature. Definition 16 shall require
certain properties for these interpretations to be considered meaningful.

An example of a substitution calculus is $\sigma$ with
$cons(t,s) =_{def} t \cdot s$, $lift(s) =_{def} 1 \cdot (s \circ \uparrow)$,
$id =_{def} id$ and $shift^j =_{def} \uparrow \circ \ldots (\uparrow \circ \uparrow)$,
where $\uparrow$ appears $j$ times.

Definition 16 (Basic substitution calculus). A substitution calculus
$\mathcal{W}$ over $\Gamma$ is said to be basic if the following conditions are
satisfied:

1. $\mathcal{W}$ is complete (strongly normalizing and confluent) over the ground
   terms in $\mathcal{T}$. We use $\mathcal{W}(a)$ to indicate the unique
   $\mathcal{W}$-normal form of $a$.

2. $\mathcal{W}$-normal forms of ground terms are pure terms.

3. For each $f \in \Gamma_f$ and $\xi \in \Gamma_b$:
   $\mathcal{W}(f(a_1,\ldots,a_n)) = f(\mathcal{W}(a_1),\ldots,\mathcal{W}(a_n))$ and
   $\mathcal{W}(\xi(a_1,\ldots,a_n)) = \xi(\mathcal{W}(a_1),\ldots,\mathcal{W}(a_n))$.

4. Rules for propagating substitutions over function symbols and binders are
   contained in $\mathcal{W}$, for each $f \in \Gamma_f$ and $\xi \in \Gamma_b$:

   $(func_f)$  $f(X_1,\ldots,X_n)[s] \to f(X_1[s],\ldots,X_n[s])$

   $(bind_\xi)$  $\xi(X_1,\ldots,X_n)[s] \to \xi(X_1[lift(s)],\ldots,X_n[lift(s)])$

5. For every substitution $s$, $1[lift(s)] =_{\mathcal{W}} 1$.

6. For every substitution $s$ and every $m > 0$,
   $m+1[lift(s)] =_{\mathcal{W}} m[s][shift]$.

7. For every term $a$ and substitution $s$ we have
   $1[cons(a,s)] =_{\mathcal{W}} a$.

8. For every term $a$, substitution $s$, and $m > 0$ we have
   $m+1[cons(a,s)] =_{\mathcal{W}} m[s]$.

9. For every $m, j \ge 1$ we have $m[shift^j] =_{\mathcal{W}} m+j$.

10. For every term $a$ we have $a[id] =_{\mathcal{W}} a$.
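
Conditions 4 to 10 determine how the macros act on pure terms, which can be illustrated by a small evaluator. The Python sketch below is not a substitution calculus from the paper: it is a hypothetical model in which substitutions are built only from the four macros, terms are nested tuples, and the term placed in a `cons` is assumed to be pure already. The name `apply_subst` is an assumption of the sketch.

```python
# Illustrative encoding (an assumption of this sketch):
#   terms:          int | ("fun", name, [args]) | ("bind", name, [args])
#   substitutions:  ("id",) | ("shift", j) | ("cons", a, s) | ("lift", s)

def apply_subst(term, s):
    """Pure term equal to term[s], computed by following conditions 4 to 10."""
    if not isinstance(term, int):
        tag, name, args = term
        inner = ("lift", s) if tag == "bind" else s      # rules (func_f), (bind_xi)
        return (tag, name, [apply_subst(a, inner) for a in args])
    kind = s[0]
    if kind == "id":
        return term                                       # condition 10
    if kind == "shift":
        return term + s[1]                                # condition 9
    if kind == "cons":
        if term == 1:
            return s[1]                                   # condition 7
        return apply_subst(term - 1, s[2])                # condition 8
    if kind == "lift":
        if term == 1:
            return 1                                      # condition 5
        shifted = apply_subst(term - 1, s[1])
        return apply_subst(shifted, ("shift", 1))         # condition 6
    raise ValueError(f"unknown substitution {s!r}")
```

With this model, instantiating the right-hand side $X[cons(Y,id)]$ of a beta-like rule with $X = app(1,2)$ and $Y = 3$ yields $app(3,1)$, and an index lying above a stack of lifts is first lowered, then substituted, then shifted back.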

Example 3. The $\sigma$ uni and $\phi$ 1^ calculi are basic substitution calculi
where the sets of function and binder symbols are $\{app\}$ and $\{\lambda\}$,
respectively.

The reader may have noted that the macro-based presentation of substitution
calculi makes use of parallel substitutions (since $cons(.,.)$ has substitution
declaration $TS$). Nevertheless, the results presented in this work may be
achieved via a macro-based presentation using a simpler set of substitutions
(such as for example the one used in II3), where $scons(.)$ has substitution
declaration $T$ and the macro $shift^i$ is only defined for $i = 1$. Indeed, the
expression $b[cons(a_1,\ldots,a_n,shift^j)]$ could be denoted by the expression

$b[lift^n(shift)]^j[scons(a_1[shift]^{n-1})]\ldots[scons(a_n)]$

Definition 17 (ExERS and FExERS). Let $\Gamma$ be an ExERS signature,
$\mathcal{W}$ a basic substitution calculus over $\Gamma$ and $\mathcal{R}$ a set of
term rewrite rules. If each rule of $\mathcal{R}$ has sort $T$ then
$\mathcal{R}_{\mathcal{W}} =_{def} (\Gamma, \mathcal{R}, \mathcal{W})$ is called an
Explicit Expression Reduction System (ExERS). If, in addition, the LHS of each rule
in $\mathcal{R}$ contains no occurrences of the substitution operator $.[.]$ then
$\mathcal{R}_{\mathcal{W}}$ is called a Fully Explicit Expression Reduction System
(FExERS).

Since reduction in SERSdb only takes place on terms, and first-order term
rewrite systems will be used to simulate higher-order reduction, all the rules of
a term rewrite system $\mathcal{R}$ are assumed to have sort $T$. However, rewrite
rules of $\mathcal{W}$ may have any sort.

Example 4. Consider the signature $\Gamma$ formed by $\Gamma_f = \{app\}$,
$\Gamma_b = \{\lambda\}$ and $\Gamma_s$ any substitution signature. Let
$\mathcal{W}$ be a basic substitution calculus over $\Gamma$. Then for
$\mathcal{R}$: $app(\lambda(X),Y) \to_{\beta_{db}} X[cons(Y,id)]$ we have that
$\mathcal{R}_{\mathcal{W}}$ is an FExERS, and for
$\mathcal{R}'$: $\mathcal{R} \cup \{\lambda(app(X[shift],1)) \to_{\eta_{db}} X\}$,
$\mathcal{R}'_{\mathcal{W}}$ is an ExERS.

Reduction in an ExERS $\mathcal{R}_{\mathcal{W}}$ is first-order reduction in
$\mathcal{R}$ modulo $\mathcal{W}$-equality. In contrast, reduction in a FExERS
$\mathcal{R}_{\mathcal{W}}$ is just first-order reduction in
$\mathcal{R} \cup \mathcal{W}$. Before defining these notions more precisely we
recall the definition of assignment.

Definition 18 (Assignment (a.k.a. grafting)). Let $\rho$ be a (partial) function
mapping variables in $\mathcal{V}$ to ground terms. We define an assignment
$\overline{\rho}$ as the unique homomorphic extension of $\rho$ over the set
$\mathcal{T}$.

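Definition 19 (Reduction and Equality). Let $o$ and $p$ be two ground terms
of sort $T$ or $S$. Given a rewrite system $\mathcal{R}$, we say that $o$ rewrites to
$p$ in one step, denoted $o \to_{\mathcal{R}} p$, iff $o = E[\overline{\rho} L]$ and
$p = E[\overline{\rho} R]$ for some assignment $\overline{\rho}$, some context $E$
and some rewrite rule $(L,R)$ in $\mathcal{R}$. We shall use
$\to^*_{\mathcal{R}}$ to denote the reflexive transitive closure of the one-step
rewrite relation.

Given an equational system $\mathcal{E}$, we say that $o$ equals $p$ modulo
$\mathcal{E}$ in one step, denoted $o =^1_{\mathcal{E}} p$, iff
$o = E[\overline{\rho} L]$ and $p = E[\overline{\rho} R]$ for some assignment
$\overline{\rho}$, some context $E$ and some equation $L = R$ in $\mathcal{E}$. We
use $=_{\mathcal{E}}$ to denote the reflexive symmetric transitive closure of
$=^1_{\mathcal{E}}$, and say that $o$ equals $p$ modulo $\mathcal{E}$ if
$o =_{\mathcal{E}} p$.

Definition 20 (ExERS and FExERS-rewriting). Let $\mathcal{R}_{\mathcal{W}}$ be an
ExERS, $\mathcal{R}'_{\mathcal{W}}$ a FExERS and $o, p$ ground terms of sort $S$ or
$T$. We say that $o$ $\mathcal{R}_{\mathcal{W}}$-reduces or rewrites to $p$, written
$o \to_{\mathcal{R}_{\mathcal{W}}} p$, iff $o \to_{\mathcal{R}/\mathcal{W}} p$ (i.e.
$o =_{\mathcal{W}} o' \to_{\mathcal{R}} p' =_{\mathcal{W}} p$); and $o$
$\mathcal{R}'_{\mathcal{W}}$-reduces to $p$ iff
$o \to_{\mathcal{R}' \cup \mathcal{W}} p$.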



3.1 Properties of Basic Substitution Calculi 



This subsection takes a look at properties enjoyed by basic substitution calculi 
and introduces a condition called the Scheme HZl- Basic substitution calculi 
satisfying the scheme ease inductive reasoning when proving properties over them 
without compromising the genericity achieved by the macro-based presentation. 



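Lemma 1 (Behavior of Substitutions in Basic Substitution Calculi).
Let $\mathcal{W}$ be a basic substitution calculus and $m \ge 1$.

1. For all $n \ge 0$ and substitutions $s$:
   $m[lift^n(s)] =_{\mathcal{W}} \begin{cases} (m-n)[s][shift]^n & \text{if } m > n \\ m & \text{if } m \le n \end{cases}$

2. For all $n \ge m \ge 1$ and all terms $a_1,\ldots,a_n$:
   $m[cons(a_1,\ldots,a_n,s)] =_{\mathcal{W}} a_m$

3. For all pure terms $a, b$ and $m \ge 1$:
   $a\{\!\{m \leftarrow b\}\!\} =_{\mathcal{W}} a[lift^{m-1}(cons(b,id))]$.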
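
The first item can be recovered from conditions 5, 6 and 9 of Definition 16 by induction on $n$. The derivation below is a sketch of that argument, reading $[shift]^n$ as $n$ successive applications of $[shift]$; it is offered as an illustration and is not taken from the paper, whose proofs are omitted.

```latex
\begin{align*}
\text{Case } m > n+1:\quad
m[lift^{n+1}(s)]
  &= m[lift(lift^{n}(s))] \\
  &=_{\mathcal{W}} (m-1)[lift^{n}(s)][shift]
     && \text{(condition 6)} \\
  &=_{\mathcal{W}} (m-1-n)[s][shift]^{n}[shift]
     && \text{(induction hypothesis, } m-1 > n\text{)} \\
  &= (m-(n+1))[s][shift]^{n+1}. \\[4pt]
\text{Case } 1 < m \le n+1:\quad
m[lift^{n+1}(s)]
  &=_{\mathcal{W}} (m-1)[lift^{n}(s)][shift]
     && \text{(condition 6)} \\
  &=_{\mathcal{W}} (m-1)[shift]
     && \text{(induction hypothesis, } m-1 \le n\text{)} \\
  &=_{\mathcal{W}} m
     && \text{(condition 9)}.
\end{align*}
```

The base case $n = 0$ is immediate since $lift^0(s)$ stands for $s$, and the case $m = 1$ with $n \ge 1$ is condition 5.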



Definition 21 (The Scheme). We say that a basic substitution calculus
$\mathcal{W}$ obeys the scheme iff for every index $m$ and every substitution symbol
$\sigma \in \Gamma_s$ of arity $q$ one of the following two conditions holds:

1. There exists a de Bruijn index $n$, positive numbers $i_1,\ldots,i_r$
   ($r \ge 0$) and substitutions $u_1,\ldots,u_k$ ($k \ge 0$) such that

   - $1 \le i_1,\ldots,i_r \le q$ and all the $i_j$'s are distinct

   - for all $o_1,\ldots,o_q$ we have:
     $m[\sigma(o_1,\ldots,o_q)] =_{\mathcal{W}} n[o_{i_1}]\ldots[o_{i_r}][u_1]\ldots[u_k]$

2. There exists an index $i$ ($1 \le i \le q$) such that for all $o_1,\ldots,o_q$ we
   have: $m[\sigma(o_1,\ldots,o_q)] =_{\mathcal{W}} o_i$

We assume these equations to be well-typed: whenever the first case holds, then
$o_{i_1},\ldots,o_{i_r}$ are substitutions; whenever the second case holds, $o_i$ is
of sort $T$.



Example 5. Examples of calculi satisfying the scheme are $\sigma$,
$\sigma_{\Uparrow}$, $\upsilon$, $f$ and $d$ jl/j.

4 From Higher-Order to First-Order Rewriting

In this section we give an algorithm, referred to as the Conversion Procedure, to 
translate any higher-order rewrite system in the formalism SERSdb to a first- 
order ExERS. The Conversion Procedure is somewhat involved since several 
conditions, mainly related to the labels of t-metavariables, must be met in order 
for a substitution to be admitted as valid. The idea is to replace all occurrences 
of t-metavariables $X_l$ by a first-order variable $X$ followed by an appropriate
index-adjusting explicit substitution which computes valid valuations.

We first give the conversion rules of the translation, then we prove its
properties in Section 5.

Definition 22 (Binding allowance). Let $M$ be a metaterm and
$\{X_{l_1},\ldots,X_{l_n}\}$ be the set of all the t-metavariables with name $X$
occurring in $M$. Then, the binding allowance of $X$ in $M$, noted $Ba_M(X)$, is the
set $\bigcap_{i=1}^{n} l_i$. Likewise, we define the binding allowance of $X$ in a
rule $(L,R)$, written $Ba_{(L,R)}(X)$.

Example 6. Let
$M = f(\xi(X_\alpha), \xi(\lambda(X_{\beta\alpha})), \nu(\xi(X_{\alpha\gamma})))$,
then $Ba_M(X) = \{\alpha\}$.

Definition 23 (Shifting index). Let $M$ be a metaterm, $X_l$ a t-metavariable
occurring in $M$, and $i$ a position in $l$. The shifting index determined by $X_l$
in $M$ at position $i$, denoted $Sh(M, X_l, i)$, is defined as

$Sh(M, X_l, i) =_{def} |\{j \mid at(l,j) \notin Ba_M(X),\ j \in 1..i-1\}|$

$Sh(M, X_l, i)$ is just the total number of binder indicators in $l$ at positions
$1..i-1$ that do not belong to $Ba_M(X)$ (thus $Sh(M, X_l, 1)$ is always 0).
Likewise, we define the shifting index determined by $X_l$ in a rule $(L,R)$ at
position $i$, written $Sh((L,R), X_l, i)$.

Example 7. Let
$M = f(\xi(X_\alpha), \xi(\lambda(X_{\beta\alpha})), \nu(\xi(X_{\alpha\gamma})))$.
Then $Sh(M, X_{\beta\alpha}, 2) = 1$ and
$Sh(M, X_{\alpha\gamma}, 1) = Sh(M, X_{\alpha\gamma}, 2) = 0$.
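
Both notions only depend on the labels carried by the $X$-based t-metavariables, so they can be sketched on labels alone. In the Python illustration below, labels are lists of strings and the function names `binding_allowance` and `shifting_index` are hypothetical; the sample labels are the ones read from Examples 6 and 7.

```python
def binding_allowance(labels):
    """Ba(X): the indicators common to every label of an X-based t-metavariable."""
    sets = [set(label) for label in labels]
    return set.intersection(*sets) if sets else set()

def shifting_index(labels, label, i):
    """Sh(M, X_l, i): indicators of `label` at positions 1..i-1 lying outside Ba(X)."""
    allowance = binding_allowance(labels)
    return sum(1 for indicator in label[: i - 1] if indicator not in allowance)

# Labels of the X-based t-metavariables of the metaterm M in Examples 6 and 7,
# writing a, b, c for the binder indicators alpha, beta, gamma.
x_labels = [["a"], ["b", "a"], ["a", "c"]]
```

Since the slice `label[: i - 1]` is empty for `i = 1`, the sketch returns 0 at position 1 for every label, in line with the remark in Definition 23.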

Definition 24 (Pivot). Let $(L,R)$ be a SERSdb-rewrite rule and let us suppose
that $\{X_{l_1},\ldots,X_{l_n}\}$ is the set of all $X$-based t-metavariables in
$(L,R)$. If $Ba_{(L,R)}(X) \neq \emptyset$, then $X_{l_j}$ for some $j \in 1..n$ is
called an ($X$-based) pivot if $|l_j| \le |l_i|$ for all $i \in 1..n$, and
$X_{l_j} \in L$ whenever possible. A pivot set for a rewrite rule $(L,R)$ is a set of
pivot t-metavariables, one for each name $X$ in $L$ such that
$Ba_{(L,R)}(X) \neq \emptyset$. This notion extends to a set of rewrite rules as
expected.

Note that Definition 24 admits the existence of more than one $X$-based
pivot t-metavariable. A pivot set for $(L,R)$ fixes a t-metavariable for each
t-metavariable name having a non-empty binding allowance.

Example 8. Both t-metavariables $X_{\alpha\beta}$ and $X_{\beta\alpha}$ can be chosen
as $X$-based pivot in the rewrite rule
$\mathcal{R}$: $Implies(\exists(\forall(X_{\alpha\beta})), \forall(\exists(X_{\beta\alpha}))) \to true$.
In the rewrite rule
$\mathcal{R}'$: $f(Y_\epsilon, \lambda(\xi(X_{\alpha\beta})), \nu(\lambda(X_{\beta\alpha}))) \to \mu(X_\alpha, Y_\alpha)$
the t-metavariable $X_\alpha$ is the only possible $X$-based pivot.

Definition 25 (Conversion of t-metavariables). Consider a SERSdb-rewrite rule
$(L,R)$ and a pivot set for $(L,R)$. We consider the following cases for every
t-metavariable name $X$ occurring in $L$:

1. = 0. Then replace each t-metavariahle Xi in {L,R) hy the 
metaterm X[shift^''^], and those t-metavariables Xi with I = e simply by X . 
This shall allow for example the rule f {\{app{Xa, 1) , X^)) — > X^, to be con- 
verted to f{\{app{X[shift]^l),X )) — ^ X. 

2. Ba(/, /{)(X) = {/3i, . . . ,/3m} with m > 0. Let Xi be the pivot t-metavariable 
for X given by the hypothesis. We replace all occurrences of a t-metavariable 
Xk in {L,R) by the term X[cons{bi, . . . ,b\i\, shifC)] where j =def |fc| + |^ \ 
Ba(i_/{)(X)| and the bi’s depend on whether Xk is a pivot t-metavariable or 
not. As an optimization and in the particular case that the resulting term 
X[cons{bi, . . . , &|/|, shifT)] is of the form cons{l , . . . , |Z|, shift^^^), then we sim- 
ply replace Xk by X. The substitution cons{bi, . . . ,b\i\, shifT) is called the 
index-adjusting substitution corresponding to Xk and it is defined as follows: 

a) if Xk is the pivot (hence I = k), then bi = i if azt {I, i) G o.nd 

bi = |Z| -I- 1 -b Sii{{L,R),XiA) ifa.t{l,i) ^ Ba(L,R){X). 

b) if Xk is not the pivot then bi = pos(/3/j,fc) if i = pos(/3/j,Z) for some 
Ph ^ ^3i{L,R){X) and bi = \k\ -\- 1 -\- Sh{{L, R), Xi,i) otherwise. 

Note that for an index-adjusting substitution X[cons{bi, . . . , &|;|, shifC)] each 
bi is a distinct de Bruijn index and less than or equal to j. Substitutions of this 
form have been called pattern substitutions in uni, where unification of higher- 
order patterns via explicit substitutions is studied. 
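
The second case of Definition 25 can be sketched in Python as follows. This is only an illustration under stated assumptions: labels are tuples of distinct binder indicators, `pos` is the 1-based position of an indicator in a label, the binding allowance is passed in as a set, and the function name `index_adjusting` is hypothetical. The result is the list of indices $b_1, \ldots, b_{|l|}$ together with the exponent j of the final shift.

```python
def index_adjusting(k, l, allowance):
    """Indices b_1..b_|l| and shift exponent j for a t-metavariable with
    label k, given the pivot label l and a non-empty binding allowance."""

    def sh(i):
        # Shifting index of the pivot label at position i.
        return sum(1 for p in range(1, i) if l[p - 1] not in allowance)

    j = len(k) + sum(1 for ind in l if ind not in allowance)
    bs = []
    for i in range(1, len(l) + 1):
        indicator = l[i - 1]
        if indicator in allowance:
            # pos(indicator, k); equals i when k is the pivot itself.
            bs.append(k.index(indicator) + 1)
        else:
            # |l| + 1 + Sh for the pivot, |k| + 1 + Sh otherwise;
            # the two coincide when k == l.
            bs.append(len(k) + 1 + sh(i))
    return bs, j


# Pivot label "alpha beta" and a second occurrence labelled "beta alpha".
swapped = index_adjusting(("b", "a"), ("a", "b"), {"a", "b"})
pivot = index_adjusting(("a", "b"), ("a", "b"), {"a", "b"})
```

For the pivot the result is the identity form `([1, 2], 2)`, so the optimization applies and the t-metavariable becomes plain X; for the swapped label the result `([2, 1], 2)` corresponds to a substitution exchanging the first two indices.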

Definition 26 (The Conversion Procedure). Given a SERSdb $\mathcal{R}$ the following
actions are taken:

1. Convert rules. The transformation of Definition 25 is applied to all rules
in $\mathcal{R}$, after selecting some set of pivot sets for $\mathcal{R}$.

2. Replace metasubstitution operator. All submetaterms of the form
$M[[N]]$ in $\mathcal{R}$ are replaced by the term substitution operator $M[cons(N, id)]$.



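Example 9. Below we present some examples of conversion of rules. We have
fixed $\mathcal{W}$ to be the $\sigma$-calculus.

SERSdb rule | Pivot selected | Transformed rule
$\lambda(app(X_\alpha, 1)) \to X_\epsilon$ | - | $\lambda(app(X[\uparrow], 1)) \to X$
$\lambda(\lambda(X_{\alpha\beta})) \to \lambda(\lambda(X_{\beta\alpha}))$ | $X_{\alpha\beta}$ | $\lambda(\lambda(X)) \to \lambda(\lambda(X[2 \cdot 1 \cdot (\uparrow \circ \uparrow)]))$
[...] | - | $f(\lambda(\lambda(X[\uparrow \circ \uparrow])), \lambda(\lambda(X[\uparrow \circ \uparrow]))) \to \lambda(X[\uparrow])$
$app(\lambda(X_\alpha), Z_\epsilon) \to X_\alpha[[Z_\epsilon]]$ | $X_\alpha$, $Z_\epsilon$ on LHS | $app(\lambda(X), Z) \to X[cons(Z, id)]$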



Let $(L, R)$ be a SERSdb-rule and P a pivot set for $(L, R)$. We write $C_P(L, R)$
for the result of applying the conversion of Definition 25 to $(L, R)$ with pivot set
P. We refer to $C_P(L, R)$ as the converted version of $(L, R)$ via P.

Note that if the SERSdb-rewrite rule $(L, R)$ which is input to the Conversion
Procedure is such that for every name X in $(L, R)$ there is a label $l$ with all
metavariables in $(L, R)$ of the form $X_l$, then all $X_l$ are replaced simply by X.
This is the case of the rule $\beta_{db}$ of Example [?].

Also, observe that if we replace our cons(.,.) macro by a scons(.) of
substitution declaration T as defined in [?], then the "Replace metasubstitution
operator" step in Definition 26 converts a metaterm of the form $M[[N]]$ into
$M[scons(N)]$, yielding first-order systems based on substitution calculi, such as
$\lambda\upsilon$, which do not implement parallel substitution.

The system resulting from the Conversion Procedure is coded as an ExERS,
a framework for defining first-order rewrite systems where $\mathcal{E}$-matching is used,
$\mathcal{E}$ being an equational theory governing the behaviour of the index-adjusting
substitutions. Moreover, when possible, an ExERS may further be coded as a
FExERS, where reduction is defined on first-order terms and matching is just
syntactic first-order matching, obtaining a full first-order system.

Definition 27 (First-order version of $\mathcal{R}$). Let $\Sigma$ be an ExERS signature
and let $\mathcal{R}$ be a SERSdb. Consider the system $fo(\mathcal{R})$ obtained by applying the
Conversion Procedure to $\mathcal{R}$ and let $\mathcal{W}$ be a substitution calculus over $\Sigma$. Then
the ExERS $fo(\mathcal{R})_{\mathcal{W}}$ is called a first-order version of $\mathcal{R}$.

In what follows we shall assume given some fixed basic substitution calculus
$\mathcal{W}$. Thus, given a SERSdb $\mathcal{R}$ we shall speak of the first-order version of $\mathcal{R}$. This
requires considering pivot selection, an issue we take up next.

Assume given some rewrite rule $(L, R)$ and different pivot sets P and Q for
this rule. It is clear that $C_P(L, R)$ and $C_Q(L, R)$ are in general not identical.
Nevertheless, we may show that the reduction relations generated by these two
converted rewrite rules are identical.

Proposition 1. Let $(L, R)$ be a SERSdb-rewrite rule and let P and Q be different
pivot sets for this rule. Then the rewrite relations generated by $C_P(L, R)$
and $C_Q(L, R)$ are identical.



4.1 Properties of the Conversion 

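The Conversion Procedure satisfies two important properties: each higher-order
rewrite step may be simulated by first-order rewriting (simulation) and rewrite
steps in the first-order version of a higher-order system $\mathcal{R}$ can be projected in
$\mathcal{R}$ (conservation).

Proposition 2 (Simulation). Let $\mathcal{R}$ be a SERSdb and let $fo(\mathcal{R})_{\mathcal{W}}$ be
its first-order version. Suppose $a \to_{\mathcal{R}} b$. If $fo(\mathcal{R})_{\mathcal{W}}$ is an ExERS then
$a \to_{fo(\mathcal{R})/\mathcal{W}} b$. If $fo(\mathcal{R})_{\mathcal{W}}$ is a FExERS then $a \to_{fo(\mathcal{R})} \circ \twoheadrightarrow_{\mathcal{W}} b$, where $\circ$
denotes relation composition.

Proposition 3 (Conservation). Let $\mathcal{R}$ be a SERSdb and $fo(\mathcal{R})_{\mathcal{W}}$ its
first-order version with $\mathcal{W}$ satisfying the scheme. If $a \to_{fo(\mathcal{R})_{\mathcal{W}}} b$ then
$\mathcal{W}(a) \twoheadrightarrow_{\mathcal{R}} \mathcal{W}(b)$.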
4.2 Essentially First-Order HORS

This last subsection gives a very simple syntactical criterion that can be used to
decide if a given higher-order rewrite system can be converted into a full first-order
rewrite system (modulo an empty theory). In particular, we can check that
many higher-order calculi in the literature, e.g. the $\lambda$-calculus, verify this property.

Definition 28 (Essentially first-order HORS). A SERSdb $\mathcal{R}$ is called
essentially first-order if $fo(\mathcal{R})_{\mathcal{W}}$ is a FExERS for $\mathcal{W}$ a basic substitution calculus.

Definition 29 (fo-condition). A SERSdb $\mathcal{R}$ satisfies the fo-condition if every
rewrite rule $(L, R) \in \mathcal{R}$ satisfies: for every name X in L let $X_{l_1}, \ldots, X_{l_n}$ be all
the X-based t-metavariables in L, then $l_1 = l_2 = \cdots = l_n$ and (the underlying set
of) $l_1$ is $Ba_{(L,R)}(X)$, and for all $X_k \in R$ we have $|k| \geq |l_1|$.

In the above definition note that $l_1 = l_2 = \cdots = l_n$ means that labels $l_1, \ldots, l_n$
must be identical (for example $\alpha\beta \neq \beta\alpha$). Also, by Definition [?], $l_1$ is simple.

Example 10. Consider the $\beta_{db}$-calculus: $app(\lambda(X_\alpha), Z_\epsilon) \to X_\alpha[[Z_\epsilon]]$. The
$\beta_{db}$-calculus satisfies the fo-condition.
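
The fo-condition is purely syntactic, so it can be checked mechanically. The Python sketch below does this for a single rule under several assumptions that are not fixed by the text at this point: a rule is summarized as a dictionary from metavariable names to a pair (labels on the LHS, labels on the RHS), labels are tuples of binder indicators, the binding allowance is taken to be the set of indicators common to all labels of a name, and the definition is read as requiring that the set of indicators of $l_1$ equals the binding allowance and that $|k| \geq |l_1|$ for every label k on the RHS. The function name is hypothetical.

```python
def satisfies_fo_condition(rule):
    """rule: {name: (lhs_labels, rhs_labels)}, each a list of label tuples."""
    for lhs_labels, rhs_labels in rule.values():
        if not lhs_labels:
            continue                    # the condition ranges over names in L
        first = lhs_labels[0]
        if any(label != first for label in lhs_labels):
            return False                # all LHS labels must be identical
        allowance = set.intersection(
            *(set(label) for label in lhs_labels + rhs_labels))
        if set(first) != allowance:
            return False                # l_1 must be exactly the allowance
        if any(len(k) < len(first) for k in rhs_labels):
            return False                # RHS labels are at least as long
    return True


# beta_db: app(lambda(X_a), Z_eps) -> X_a[[Z_eps]]
beta_db = {"X": ([("a",)], [("a",)]), "Z": ([()], [()])}
# lambda(app(X_a, 1)) -> X_eps
eta_like = {"X": ([("a",)], [()])}
```

Under this reading `beta_db` passes the check, while `eta_like` fails because the label on its LHS is not contained in the (empty) binding allowance.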

Proposition 4 puts forward the importance of the fo-condition. Its proof relies
on a close inspection of the Conversion Procedure.

Proposition 4. Let $\mathcal{R}$ be a SERSdb satisfying the fo-condition. Then $\mathcal{R}$ is
essentially first-order.

Note that many results on higher-order systems (e.g. perpetuality [?],
standardization [?]) require left-linearity (a metavariable may occur at most once on
the LHS of a rewrite rule), and fully-extendedness or locality (if a metavariable
$X(t_1, \ldots, t_n)$ occurs on the LHS of a rewrite rule then $t_1, \ldots, t_n$ is the list of
variables bound above it). The reader may find it interesting to observe that
these conditions together seem to imply the fo-condition. A proof of this fact
would require either developing the results of this work in the above mentioned
HORS or via some suitable translation to the SERSdb formalism.

5 Conclusions and Future Work 

This work presents an encoding of higher-order term rewriting systems into first- 
order rewriting systems modulo an equational theory. This equational theory 
takes care of the substitution process. The encoding has furthermore allowed 
us to identify in a simple syntactical manner, via the so-called fo-condition, a 
class of HORS that are fully first-order in that they may be encoded as first- 
order rewrite systems modulo an empty equational theory. This amounts to 
incorporating, into the first-order notion of reduction, not only the computation 
of substitutions but also the higher-order (pattern) matching process. It is fair to 
say that a higher-order rewrite system satisfying this condition requires a simple 
matching process, in contrast to those that do not satisfy this condition (such as 
the $\lambda\beta\eta$-calculus). Other syntactical restrictions, such as linearity and locality,
imposed on higher-order rewrite systems in [?] in order to reason about their
properties can be related to the fo-condition in a very simple way. This justifies
that the fo-condition, even if obtained very technically in this paper, may be
seen as an interpretation of what a well-behaved higher-order rewrite system is.

Moreover, this encoding has been achieved by working with a general pre- 
sentation of substitution calculi rather than dealing with some particular sub- 
stitution calculus. Any calculus of explicit substitutions satisfying this general 
presentation based on macros will do. 

Some further research directions are summarized below: 

— As already mentioned, the encoding opens up the possibility of transferring 
results concerning confluence, termination, completion, evaluation strategies, 
implementation techniques, etc. from the first-order framework to the higher- 
order framework. 

— Given a SERSdb $\mathcal{R}$, note that the LHSs of rules in $fo(\mathcal{R})$ may contain
occurrences of the substitution operator (pattern substitutions). It would be
interesting to deal with pattern substitutions and "regular" term substitutions
(those arising from the conversion of the de Bruijn metasubstitution
operator $\cdot[[\cdot]]$) as different substitution operators at the object-level. This
would neatly separate the explicit matching computation from that of the
usual substitution replacing terms for variables.

— This work has been developed in a type-free framework. The notion of type is
central to Computer Science. This calls for a detailed study of the encoding
process dealing with typed higher-order rewrite systems such as HRS [?].

— The ideas presented in this paper could be used to relax conditions in [?],
where only rewrite systems with atomic propositions on the LHSs of rules
are considered.

Acknowledgements. We thank Bruno Guillaume for helpful remarks. 
Abstract. We present an algorithm for unification of higher-order patterns
modulo combinations of disjoint first-order equational theories.
This algorithm is highly non-deterministic, in the spirit of those by
Schmidt-Schauß [?] and Baader-Schulz [?] in the first-order case. We
redefine the properties required for elementary pattern unification
algorithms of pure problems in this context, then we show that some
theories of interest have elementary unification algorithms fitting our
requirements. This provides a unification algorithm for patterns modulo
the combination of theories such as the free theory, commutativity,
one-sided distributivity, associativity-commutativity and some of its
extensions, including Abelian groups.



Introduction 

Patterns have been defined by Miller [18] in order to provide a compromise
between simply-typed lambda-terms, for which unification is known to be
undecidable [?], and mere first-order terms, which are deprived of any abstraction
mechanism. A pattern is a term of the simply-typed lambda-calculus in which the
arguments of a free variable are all pairwise distinct bound variables. Patterns
are close to first-order terms in that the free variables (with their bound
variables as only permitted arguments) are at the leaves. Under this rather drastic
restriction, unification becomes decidable and unitary:

Theorem 1 ([18]). It is decidable whether two patterns are unifiable, and there
exists an algorithm which computes a most general unifier of any two unifiable
patterns.

Yet, patterns are useful in practice for defining higher-order pattern rewrite
systems [?], or for defining functions by cases in functional programming
languages. Some efforts have been devoted to the study of languages combining
functional programming (lambda-calculus) and algebraic programming (term
rewriting systems) [?].

In this paper we provide a nondeterministic algorithm for combining
elementary equational pattern unification algorithms. This is the object of Section
2. In Sections 3 and 4 we show that such elementary unification algorithms
exist for theories such as the free theory, commutativity, one-sided distributivity,
as well as associativity-commutativity and its common extensions including
Abelian groups.

Our method does not consist of using a first-order unification algorithm for
the combined equational theories extended to the case of patterns. Such an
approach has been used by Qian & Wang [?], but considering a first-order
unification algorithm as a black box leads to incompleteness (see the example in [5]).
What we need here is a pattern unification algorithm for each of the theories
to be combined, plus a combination algorithm for the elementary $E_i$-pattern
unification algorithms. Evidence of this need is that, contrary to the first-order
case, the unifier set of the pure equation $\lambda xy.Fxy = \lambda xy.Fyx$ modulo
the free theory (no equational axioms) changes if one adds, say, a commutative
axiom $x + y = y + x$ (see [?]). The requirements we have on elementary pattern
unification algorithms are very much in the spirit of those needed in the first-order
case, yet there are relevant differences due in particular to the possible
presence of equations with the same free variable at the head on both sides (like
$\lambda xy.Fxy = \lambda xy.Fyx$).

The worst difficulties, for the combination part as well as for the elementary
unification algorithms, come from such equations, which have no minimal
complete sets of E-unifiers, even for theories which are finitary in the
first-order case. On top of this, the solutions of such equations introduce terms
which are not patterns. For this reason, we will never attempt to solve such
equations explicitly, but we will keep them as constraints. The output of our
algorithm is a (DAG-) solved form constrained by some equations of the above
form and compatible with them. Like the algorithm by Baader and Schulz, ours
can be used for combining decision procedures.

1 Preliminaries 

We assume the reader is familiar with simply-typed lambda-calculus and
equational unification. Some background is available in e.g. [?] for
lambda-calculus and E-unification.



1.1 Patterns and Equational Theories 

Miller [18] has defined the patterns as those terms of the simply-typed
lambda-calculus in which the arguments of a free variable are ($\eta$-equivalent to)
pairwise distinct bound variables: $\lambda xyz.f(H(x, y), H(x, z))$ and $\lambda x.F(\lambda z.x(z))$^1 are
patterns while $\lambda xy.G(x, x, y)$, $\lambda xy.H(x, f(y))$ and $\lambda xy.H(F(x), y)$ are not
patterns. We shall use the following notations: the sequence of variables $x_1, \ldots, x_n$
will be written $\bar{x}_n$, or even $\bar{x}$ if n is not relevant. Hence $\lambda x_1 \cdots \lambda x_n.s$ will be
written $\lambda\bar{x}_n.s$, or even $\lambda\bar{x}.s$. If in a same expression $\bar{x}$ appears several times
it denotes the same sequence of variables. If $\pi$ is a permutation of $\{1, \ldots, n\}$,
$\bar{x}_n^{\pi}$ denotes the sequence $x_{\pi(1)}, \ldots, x_{\pi(n)}$. In the following, we shall use either
$\lambda\bar{x}.F(\bar{x}^{\pi})$ or the $\alpha$-equivalent term $\lambda\bar{y}^{\varphi}.F(\bar{y})$, where $\varphi = \pi^{-1}$, in order to denote
$\lambda x_1 \cdots \lambda x_n.F(x_{\pi(1)}, \ldots, x_{\pi(n)})$. The curly-bracketed expression $\{\bar{x}_n\}$ denotes the
(multi) set $\{x_1, \ldots, x_n\}$. In addition, we will use the notation $t(u_1, \ldots, u_n)$ or
$t(\bar{u}_n)$ for $(\cdots(t\ u_1) \cdots u_n)$. The free variables of a term t are denoted by $FV(t)$.

$t|_p$ is the subterm of t at position p. The notation $t[u]_p$ stands for a term
t with a subterm u at position p, $t[u_1, \ldots, u_n]$ for a term t having subterms
$u_1, \ldots, u_n$.

^1 We will always write such a pattern in the ($\eta$-equivalent) form $\lambda x.F(x)$, where the
argument of the free variable F is indeed a bound variable.
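
The pattern restriction can be tested by a short recursive traversal. The Python sketch below is an illustration only: the term representation (`("lam", names, body)` and `("app", head, args)`) is an assumption, free variables are recognized by an upper-case first letter following the naming convention of this paper, and $\eta$-equivalence is deliberately ignored, so arguments must literally be bound variables.

```python
def var(name):
    """A variable or constant occurrence without arguments."""
    return ("app", name, [])


def is_pattern(term, bound=frozenset()):
    if term[0] == "lam":                       # ("lam", [x1, ..., xn], body)
        _, names, body = term
        return is_pattern(body, bound | frozenset(names))
    _, head, args = term                       # ("app", head, [arg1, ...])
    if head[0].isupper() and head not in bound:
        # Free variable: arguments must be pairwise distinct bound variables.
        arg_names = [a[1] for a in args
                     if a[0] == "app" and not a[2] and a[1] in bound]
        return len(arg_names) == len(args) and len(set(arg_names)) == len(args)
    return all(is_pattern(a, bound) for a in args)


# lambda xyz. f(H(x, y), H(x, z)) is a pattern.
good = ("lam", ["x", "y", "z"],
        ("app", "f", [("app", "H", [var("x"), var("y")]),
                      ("app", "H", [var("x"), var("z")])]))
# lambda xy. G(x, x, y) is not: the bound variable x is repeated.
repeated = ("lam", ["x", "y"], ("app", "G", [var("x"), var("x"), var("y")]))
# lambda xy. H(x, f(y)) is not: f(y) is not a bound variable.
nested = ("lam", ["x", "y"],
          ("app", "H", [var("x"), ("app", "f", [var("y")])]))
```

The three terms are the examples given in the text, and the check accepts the first and rejects the other two.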

Unless otherwise stated, we assume that the terms are in $\eta$-long $\beta$-normal
form [?], the $\beta$ and $\eta$ rules being respectively oriented as follows: $(\lambda x.M)N \to_\beta M\{x \mapsto N\}$
and $F \to_\eta \lambda\bar{x}_n.F(\bar{x}_n)$ if the type of F is $\alpha_1 \to \cdots \to \alpha_n \to \alpha$, and
$\alpha$ is a base type. In this case, F is said to have arity n. The $\eta$-long $\beta$-normal
form of a term t is denoted by $t{\updownarrow}^{\eta}_{\beta}$.

A substitution $\sigma$ is a mapping from a finite set of variables to terms of the
same type, written $\sigma = \{X_1 \mapsto t_1, \ldots, X_n \mapsto t_n\}$. The set $\{X_1, \ldots, X_n\}$ is called
the domain of $\sigma$ and denoted by $Dom(\sigma)$.

The equational theories we consider here are the usual first-order equational
theories: given a set E of (unordered) first-order axioms built over a signature
$\mathcal{F}$, $=_E$ is the least congruence^2 containing all the identities $l\sigma = r\sigma$ where $l = r \in E$
and $\sigma$ is a suitably typed substitution. $=_{\eta\beta E}$ is then the least congruence
containing $=_E$, $=_\eta$ and $=_\beta$.

The following is a key theorem by Tannen. It allows us to restrict our attention
to $=_E$ for deciding $\eta$-$\beta$-$E$-equivalence of terms in $\eta$-long $\beta$-normal form:

Theorem 2 ([7]). Let E be an equational theory and s and t two terms. Then
$s =_{\eta\beta E} t \iff s{\updownarrow}^{\eta}_{\beta} =_E t{\updownarrow}^{\eta}_{\beta}$.

1.2 Unification Problems 

Unification problems are formulas built up using only the equality predicate
= (between terms), conjunctions, disjunctions and existential quantifiers. The
solutions of $s = t$ are the substitutions $\sigma$ such that $s\sigma =_{\eta\beta E} t\sigma$. This definition
extends in the natural way to unification problems. We restrict our attention to
problems of the form $(\exists\bar{X})\ s_1 = t_1 \wedge \cdots \wedge s_n = t_n$, the only disjunctions being
implicitly introduced by the non-deterministic rules.

Terminology. In the following, free variable denotes an occurrence of a variable
which is not $\lambda$-bound and bound variable an occurrence of a variable which
is $\lambda$-bound. To specify the status of a free variable with respect to existential
quantifications, we will explicitly write existentially quantified or not existentially
quantified. In the sequel, upper-case F, G, X, ... will denote free variables, a, b,
f, g, ... constants, and x, y, z, $x_1$, ... bound variables.

Without loss of generality, we assume that the left-hand sides and right-hand
sides of the equations have the same prefix of $\lambda$-bindings. This is made possible
(by using $\alpha$-conversion if necessary) because the two terms have to be in $\eta$-long
$\beta$-normal form and of the same type. In other terms, we will assume that the
equations are of the form $\lambda\bar{x}.s = \lambda\bar{x}.t$ where s and t do not have an abstraction
at the top.

^2 compatible also with application and $\lambda$-abstraction in our context.

Definition 1. A flexible pattern is a term of the form $\lambda\bar{x}.F(\bar{y})$ where F is a free
variable and $\{\bar{y}\} \subseteq \{\bar{x}\}$. A flex-flex equation is an equation between two flexible
patterns. An equation is quasi-solved if it is of the form $\lambda\bar{x}_k.F(\bar{y}_n) = \lambda\bar{x}_k.s$ and
$FV(s) \cap \{\bar{x}_k\} \subseteq \{\bar{y}_n\}$ and $F \notin FV(s) \setminus \{\bar{x}_k\}$. A variable is solved in a unification
problem if it occurs only once as the left-hand side of a quasi-solved equation.

Lemma 1. If the equation $\lambda\bar{x}_k.F(\bar{y}_n) = \lambda\bar{x}_k.s$ is quasi-solved, then it has the
same solutions as $\lambda\bar{y}_n.F(\bar{y}_n) = \lambda\bar{y}_n.s$ and (by $\eta$-equivalence) as $F = \lambda\bar{y}_n.s$. A
most general unifier of such an equation is $\{F \mapsto \lambda\bar{y}_n.s\}$.

For the sake of readability, we will often write a quasi-solved equation in the
form $F = \lambda\bar{y}_n.s$ instead of $\lambda\bar{x}_k.F(\bar{y}_n) = \lambda\bar{x}_k.s$.

Definition 2. A DAG-solved form is a problem of the form $(\exists Y_1 \cdots Y_m)\ X_1 = s_1 \wedge \cdots \wedge X_n = s_n$
where for $1 \leq i \leq n$, $X_i$ and $s_i$ have the same type, and
$X_i \neq X_j$ for $i \neq j$ and $X_i \notin FV(s_j)$ for $i \leq j$. A solved form is a problem of
the form $(\exists Y_1 \cdots Y_m)\ X_1 = s_1 \wedge \cdots \wedge X_n = s_n$ where for $1 \leq i \leq n$, $X_i$ and
$s_i$ have the same type, $X_i$ is not existentially quantified, and $X_i$ has exactly one
occurrence.

A solved form is obtained from a DAG-solved form by applying as long as
possible the rules:

Quasi-solved   $\lambda\bar{x}_k.F(\bar{y}_n) = \lambda\bar{x}_k.s \wedge P \to F = \lambda\bar{y}_n.s \wedge P$

Replacement   $F = \lambda\bar{y}_n.s \wedge P \to F = \lambda\bar{y}_n.s \wedge P\{F \mapsto \lambda\bar{y}_n.s\}$
   if F has a free occurrence in P.

EQE   $(\exists F)\ F = t \wedge P \to P$   if F has no free occurrence in P.



1.3 Combinations of Equational Theories 



As in the first-order case [?], we will consider a combination of equational
theories. We assume that $E_0, \ldots, E_n$ are equational theories over disjoint
signatures $\mathcal{F}_0, \ldots, \mathcal{F}_n$ and we will provide a unification algorithm for the
theory E presented by the union of the presentations, provided an elementary
$E_i$-unification algorithm for patterns is known for each $E_i$. As in the first-order
case, we have some further assumptions on elementary $E_i$-unification that will
be made precise later.
Definition 3. The theory of a bound variable, or of a free algebraic symbol, i.e.,
a symbol which does not appear in any axiom, is the free theory $E_0$. The theory of
an algebraic symbol $f \in \mathcal{F}_i$ is $E_i$. A variable F is $E_i$-instantiated in a unification
problem P if it occurs in a quasi-solved equation $F = s$ where the head of s has
theory $E_i$. A variable F is $E_i$-instantiated by a substitution $\sigma$ if the head of $F\sigma$
has theory $E_i$.

We first give two well-known rules in order to obtain equations between pure 
terms only. 

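Definition 4. The subterm u of the term $t[u]_p$ is an alien subterm in t if it
occurs as an argument of a symbol from a different theory from its head symbol.

VA   $\lambda\bar{x}.s[u]_p = \lambda\bar{x}.t \to \exists F\ \lambda\bar{x}.s[F(\bar{y})]_p = \lambda\bar{x}.t \wedge \lambda\bar{y}.F(\bar{y}) = \lambda\bar{y}.u$
   if u is an alien subterm and $\{\bar{y}\} = \{\bar{x}\} \cap FV(u)$ and F is a new variable of
   appropriate type.

Split   $\lambda\bar{x}.\gamma(\bar{s}) = \lambda\bar{x}.\delta(\bar{t}) \to \exists F\ \lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.\gamma(\bar{s}) \wedge \lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.\delta(\bar{t})$
   if $\gamma$ and $\delta$ are not free variables and belong to different theories, where F is
   a new variable of appropriate type.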



The above rules obviously terminate and yield a problem which is equivalent 
to the original problem and where all the equations are pure, i.e., containing 
only symbols from a same theory. We can now split a unification problem into 
several pure subproblems: 

Definition 5. A unification problem P will be written in the form

$$P \equiv (\exists\bar{X})\ P_F \wedge P_V \wedge P_0 \wedge P_1 \wedge \cdots \wedge P_n$$

where

— $P_F$ is the set of equations of the form $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$, where $\pi$ is
a permutation of $\{1, \ldots, n\}$. Such equations will be called frozen^3.

— $P_V$ is the set of equations of the form $\lambda\bar{x}.F(\bar{y}) = \lambda\bar{x}.G(\bar{z})$, where F and G
are different free variables.

— $P_0, \ldots, P_n$ are pure problems in the theories $E_0, \ldots, E_n$ respectively.

2 A Combination Algorithm 

In this section, we will present a non-deterministic algorithm, one step beyond
that by Baader and Schulz, who extend the method initiated by Schmidt-Schauß
for unification in combinations of equational theories. In particular,
we will guess a projection for each free variable, and then restrict our attention
to constant-preserving substitutions, like we suggested in [?].

^3 These equations are always trivially solvable, for example by $F \mapsto \lambda\bar{x}.C$, where C
is a variable of the appropriate base type, but we will never solve them explicitly
because they have no minimal complete sets of unifiers and their solutions introduce
terms which are not patterns, see [?].

The aim of these rules is to guess in advance the form of a unifier, and to
make the algorithm fail when the solutions of the problem do not correspond
to the choices that are being considered currently. The drawback of such an
approach is a blow-up in the complexity, but it allows us to avoid recursion, hence
guaranteeing termination.

After the variable abstraction step, we want to guess, for each free variable
F of arity k (i.e., whose $\eta$-long form is $\lambda\bar{x}_k.F(x_1, \ldots, x_k)$), which of the bound
variables $x_1, \ldots, x_k$ will effectively participate in the solution.

Definition 6. A constant-preserving substitution is a substitution $\sigma$ such that
for all $F \in Dom(\sigma)$, if $F\sigma{\updownarrow}^{\eta}_{\beta} = \lambda\bar{x}_k.s$ then every variable of $\bar{x}_k$ has a free
occurrence in s. A projection is a substitution of the form

$$\sigma = \{F \mapsto \lambda\bar{x}.F'(\bar{y}) \mid F \in Dom(\sigma),\ \{\bar{y}\} \subseteq \{\bar{x}\}\}$$

Lemma 2. For every substitution $\sigma$, there exist a projection $\pi$ and a
constant-preserving substitution $\theta$ such that $\sigma = \pi\theta$.

Lemma 3. The equation $\lambda\bar{x}.s = \lambda\bar{x}.t$ where $\{\bar{x}\} \cap FV(s) \neq \{\bar{x}\} \cap FV(t)$ has no
constant-preserving E-solution. In particular, the equation $\lambda\bar{x}.F(\bar{y}) = \lambda\bar{x}.G(\bar{z})$,
where $\{\bar{y}\}$ and $\{\bar{z}\}$ are not the same set, has no constant-preserving E-solution.

This will allow us to choose (and apply) a projection for the free variables
and to discard in the sequel the problems whose solutions are not
constant-preserving.
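
To give a feeling for the size of this guess, the following Python sketch enumerates the candidate projections of one variable as the sets of argument positions that are kept, and encodes the necessary condition of Lemma 3 for flex-flex equations. Both function names and the position-based representation are assumptions made for the illustration.

```python
from itertools import combinations


def candidate_projections(arity):
    """All choices of kept argument positions (1-based) for one variable."""
    positions = range(1, arity + 1)
    return [kept
            for size in range(arity + 1)
            for kept in combinations(positions, size)]


def may_have_constant_preserving_solution(ys, zs):
    """Necessary condition from Lemma 3 for lambda x. F(ys) = lambda x. G(zs)."""
    return set(ys) == set(zs)


choices = candidate_projections(3)
```

A variable of arity k has $2^k$ candidate projections, which is one source of the blow-up in complexity mentioned above.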

2.1 Non-deterministic Choices 

At this point, we will guess through "don't know" nondeterministic choices some
properties of the solutions. The idea is that once a choice has been made, some
failure rules will apply when the solutions of the current problem correspond to
other choices. This method, initiated by Schmidt-Schauß [?], is obviously correct
because the solutions of the problems that are discarded this way correspond to
another choice and will be computed in the corresponding branch.

The following transformations have to be successively applied to the problem:

C.1 Choose a projection for the free variables

We first guess, for a given solution $\sigma$, the projection of which $\sigma$ will be an instance
by a constant-preserving substitution, in the conditions of Lemma 2. This is
achieved by applying nondeterministically the following rule to some of the free
variables F of the problem:

Project   $P \to (\exists F')\ F = \lambda\bar{x}_n.F'(\bar{y}_k) \wedge P\{F \mapsto \lambda\bar{x}_n.F'(\bar{y}_k)\}$
   where F has arity n and F' is a new variable and $\{\bar{y}_k\} \subseteq \{\bar{x}_n\}$.

After this step, we can restrict our attention to constant-preserving solutions.
C.2 Choose some flex-flex equations

We now guess the equations of the form $\lambda\bar{x}.F(\bar{y}) = \lambda\bar{x}.G(\bar{z})$ that will be satisfied
by a solution $\sigma$. This is done by applying the following rule to some pairs $\{F, G\}$
of the free variables of the problem:

FF$_{\neq}$   $P \to F = \lambda\bar{x}.G(\bar{x}^{\pi}) \wedge P\{F \mapsto \lambda\bar{x}.G(\bar{x}^{\pi})\}$
   where $\pi$ is a permutation of $\{1, \ldots, n\}$, F has type $\tau_1 \to \cdots \to \tau_n \to \tau$, G
   has type $\tau_{\pi(1)} \to \cdots \to \tau_{\pi(n)} \to \tau$, $F \neq G$ and F and G occur in P.

We restrict the application of this rule to pairs of variables of the same arity
and of the same type (up to a permutation of the types of the arguments),
because after applying Project, the only flex-flex equations admitting
constant-preserving solutions are of this form.

C.3 Choose the permutations on the arguments of the variables

For each free variable F of type $\tau_1 \to \cdots \to \tau_n \to \tau$, we choose the group
of permutations Perm(F) such that a solution $\sigma$ satisfies $\lambda\bar{x}.F\sigma(\bar{x}) =_{\eta\beta E} \lambda\bar{x}.F\sigma(\bar{x}^{\pi})$
for each $\pi \in$ Perm(F). For this, we apply the following rule to
some of the free variables F of the problem:

FF$_{=}$   $P \to \lambda\bar{x}_n.F(\bar{x}_n) = \lambda\bar{x}_n.F(\bar{x}_n^{\pi}) \wedge P$
   where F is a free variable of P of type $\tau_1 \to \cdots \to \tau_n \to \tau$ and $\pi$ is a
   permutation such that $\tau_{\pi(i)} = \tau_i$ for $1 \leq i \leq n$.
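
The side condition of this rule restricts the guess to permutations that preserve the argument types. A minimal Python sketch of that enumeration follows; it assumes argument types are given as a list of comparable values and uses 0-based positions, and the function name is hypothetical.

```python
from itertools import permutations


def type_preserving_permutations(arg_types):
    """Permutations p of the argument positions with type[p(i)] == type[i]."""
    n = len(arg_types)
    return [p for p in permutations(range(n))
            if all(arg_types[p[i]] == arg_types[i] for i in range(n))]


# Two arguments of type "a" followed by one of type "b":
# only the first two positions may be exchanged.
perms = type_preserving_permutations(["a", "a", "b"])
```

Each permutation returned is a candidate for a frozen equation on F; the group Perm(F) that is guessed is generated by the permutations actually chosen.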





C.4 Apply as long as possible the following transformation:

Coalesce   $\lambda\bar{x}.F(\bar{y}) = \lambda\bar{x}.G(\bar{z}) \wedge P \to F = \lambda\bar{y}.G(\bar{z}) \wedge P\{F \mapsto \lambda\bar{y}.G(\bar{z})\}$
   if $F \neq G$ and $F, G \in FV(P)$, where $\bar{y}$ is a permutation of $\bar{z}$.

After Project has been applied, the arity of the values of the variables is fixed,
hence two variables may be identified only if they have the same arity. Note that
$F = \lambda\bar{y}.G(\bar{z})$ is solved after the application of Coalesce. After this step, we
have an equivalence relation on the variables, and a notion of representative:

Definition 7. Two variables F and G are identified in P if they appear in
an equation $\lambda\bar{x}.F(\bar{y}) = \lambda\bar{x}.G(\bar{z})$ of P. The relation $=_V$ is the least equivalence
containing any pair of identified variables.

Assume that Coalesce has been applied as long as possible to P. In an equivalence
class of $=_V$, only a single variable may occur more than once in P. When
such a variable exists, it is chosen as a representative for all variables in that
class. Otherwise, the representative is chosen arbitrarily in the equivalence class.
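
The equivalence classes and their representatives can be computed with a standard union-find structure. The Python sketch below is an illustration under assumptions: variables are strings, `identified_pairs` lists the pairs of variables appearing together in a flex-flex equation, and `occurrences` is a hypothetical map giving how often each variable occurs in the problem. The arbitrary choice is made deterministic here by taking the smallest name.

```python
def equivalence_classes(variables, identified_pairs):
    """Classes of the least equivalence containing the identified pairs."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for f, g in identified_pairs:
        parent[find(f)] = find(g)
    classes = {}
    for v in variables:
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())


def representative(cls, occurrences):
    """The variable occurring more than once if any, else an arbitrary one."""
    repeated = [v for v in cls if occurrences.get(v, 0) > 1]
    return repeated[0] if repeated else min(cls)


classes = equivalence_classes(["F", "G", "H", "K"], [("F", "G"), ("G", "H")])
```

With the pairs above, F, G and H fall into one class and K stays alone.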
C.5 Choose a theory for the representatives

We now guess, for each representative F, the theory $E_i$ such that F is allowed
to have the head symbol of its value by a solution in $E_i$. Again, this was already
done by Schmidt-Schauß [?] and by Baader and Schulz.

C.6 Choose an ordering on representatives

Finally, we guess a total strict ordering compatible with the occur-check relation
defined by $F < G$ if $G\sigma$ is a proper subterm of $F\sigma$. Choose a total ordering $<_{oc}$ on
the representatives of the variables of the problem. This is exactly what Baader
and Schulz do in the first-order case [?], reflecting the fact that if $\sigma$ is a finite
solution, then the occur-check relation must be acyclic.

2.2 Solving Pure Subproblems 

We now make precise our assumptions on the elementary $E_i$-unification
algorithms. First, we take note of the fact that there is not much to do with the
frozen equations of $P_F$:

Frozen Equations

Although they are always trivially solvable in the free theory, we will never try to
solve the equations of the form $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$ of $P_F$. These equations will
be kept as constraints because they do not have finite complete sets of unifiers
even for theories which have finitary first-order unification, and their solutions
introduce terms which are not patterns. Here is an example by Qian and Wang:

Example 1 ([?]). Consider the equation $\lambda xy.F(x, y) = \lambda xy.F(y, x)$ in the
AC-theory of +. For $m > 0$, the substitution
$\sigma_m = \{F \mapsto \lambda xy.G_m(H_1(x, y) + H_1(y, x), \ldots, H_m(x, y) + H_m(y, x))\}$
is an AC-unifier of the above equation. On
the other hand, every solution of this equation is an instance of some $\sigma_i$. In addition $\sigma_{n+1}$
is strictly more general than $\sigma_n$.
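
That each substitution of this family is an AC-unifier can be checked mechanically by comparing both sides modulo associativity and commutativity of +. The Python sketch below does so with a naive normal form (flatten nested sums and sort their arguments); the tuple-based term representation and the function names are assumptions, and the symbols $G_m$ and $H_i$ are treated as free symbols.

```python
def ac_normal(term):
    """Normal form modulo AC of "+": flatten sums and sort their arguments."""
    if isinstance(term, str):
        return term
    head, *args = term
    args = [ac_normal(a) for a in args]
    if head == "+":
        flat = []
        for a in args:
            if isinstance(a, tuple) and a[0] == "+":
                flat.extend(a[1:])          # associativity
            else:
                flat.append(a)
        return ("+", *sorted(flat, key=repr))   # commutativity
    return (head, *args)


def sigma_body(m, x, y):
    """Body of the m-th substitution applied to the arguments (x, y)."""
    return (f"G{m}", *[("+", (f"H{i}", x, y), (f"H{i}", y, x))
                       for i in range(1, m + 1)])


# Both sides of the equation after applying the substitution with m = 3.
lhs = ac_normal(sigma_body(3, "x", "y"))
rhs = ac_normal(sigma_body(3, "y", "x"))
```

The two normal forms coincide for every m, whereas a body such as $G(H(x, y))$ without the symmetric sum would not be invariant under exchanging x and y.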

Hence, AC-unification of patterns is not only infinitary, but nullary, in the
sense that some problems do not have minimal complete sets of AC-unifiers [?].
All we can do is to make sure that the frozen equations in $P_F$ are compatible
with a (DAG-) solved form of the problem:

Definition 8. Given a conjunction $P_F$ of flex-flex equations of the form
$\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$, we will write $P_F \models s =_{\eta\beta E} t$ if $s{\updownarrow}^{\eta}_{\beta} = t{\updownarrow}^{\eta}_{\beta}$ can be proved
using the axioms of E and the equations $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$ of $P_F$, where F is
treated like a free algebraic symbol. A substitution $\sigma$ is compatible with $P_F$ if for
all equations $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$ of $P_F$, $P_F \models \lambda\bar{x}.F\sigma(\bar{x}) =_{\eta\beta E} \lambda\bar{x}.F\sigma(\bar{x}^{\pi})$.

Lemma 4. If a substitution $\sigma$ (seen as a conjunction of equations) is compatible
with $P_F$ as defined above, then the E-solutions of $\sigma \wedge P_F$ are the substitutions
$\sigma\theta$, where $\theta$ is an E-solution of $P_F$.
Definition 9. Given a unification problem P with no flex-flex equations and
a conjunction $P_F$ of (frozen) equations of the form $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$, a
constrained E-solved form of P is $P' \wedge P'_F$ where

— $P'$ is a solved form with mgu $\sigma$, containing no equations of the form
$\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$.

— $P'_F$ contains $P_F$ plus some equations of the form $\lambda\bar{x}.G(\bar{x}) = \lambda\bar{x}.G(\bar{x}^{\pi})$, where
G is a new variable not E-instantiated in $P'$.

— $P'_F \models s\sigma =_{\eta\beta E} t\sigma$ for every equation $s = t$ of P.

— $\sigma$ is compatible with $P'_F$.

In this case, $\sigma$ constrained by $P'_F$, denoted by $\sigma|_{P'_F}$, is called a constrained
E-unifier of $P \wedge P_F$. The solutions of $\sigma|_{P'_F}$ are the substitutions $\sigma\theta$ where $\theta$ is
a solution of $P'_F$.

Definition 10 (Solve rule for elementary theories).

A Solve rule for the theory $E_i$ is an algorithm that takes as input a problem $P_i$,
pure in $E_i$, and a conjunction $P_F$ of frozen equations (as in Definition 8) and
that returns $P'_i$ and $P'_F$ such that

1. $P'_i$ is a solved form with mgu a constant-preserving substitution $\sigma$^4.

2. $P'_i$ has no flex-flex equations.

3. $P'_F$ contains the equations of $P_F$ plus some only flexible-flexible equations
of the form $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi})$, where $F \notin FV(P_i) \cup Dom(\sigma)$.

4. F can be $E_i$-instantiated by $\sigma$ only if $E_i$ has been chosen as the theory of F
at the step C.5.

5. $F\sigma$ can be of the form $\lambda\bar{x}.c[G(\cdots)]$, where $F, G \in FV(P_i)$ and another theory
than $E_i$ has been chosen for G at the step C.5, only if $F <_{oc} G$, for the
ordering chosen at the step C.6.

6. $\sigma$ is compatible with $P'_F$.

7. $P'_F \models \lambda\bar{x}.s\sigma =_{\eta\beta E_i} \lambda\bar{x}.t\sigma$ for all the equations $\lambda\bar{x}.s = \lambda\bar{x}.t$ of $P_i$.

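Proposition 1. Let $s = t$ be an equation, pure in the theory $E_i$, and let $\sigma$ be
an $E$-solution of $s = t$. Then there exists a set of equations $P_{perm}$ of the form
$\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}^{\pi}.F(\bar{x})$, and two substitutions $\sigma_{E_i}$ and $\theta$ such that

— $\sigma =_E \sigma_{E_i}\theta$,

— $\sigma_{E_i}$ is pure in the theory $E_i$,

— $\theta$ is an $E$-solution of $P_{perm}$,

— $P_{perm} \models s\sigma_{E_i} =_{\eta\beta E_i} t\sigma_{E_i}$,

— if $F \in Dom(\sigma)$ and there exists a permutation $\pi$ such that $\lambda\bar{x}.F(\bar{x})\sigma =_{\eta\beta E}
\lambda\bar{x}.F(\bar{x}^{\pi})\sigma$, then $P_{perm} \models \lambda\bar{x}.F(\bar{x})\sigma_{E_i} =_{\eta\beta E_i} \lambda\bar{x}.F(\bar{x}^{\pi})\sigma_{E_i}$.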

The result is obtained by using Theorem 2 and adapting the proof of Theorem
5.1 of 0, which is the corresponding theorem in the first-order case.

The correctness will be preserved if one allows non-constant-preserving substitutions, 
but the redundancy of the complete sets of unifiers will be increased in this case. 
The Algorithm

Algorithm for pattern unification modulo $E_0 \cup \cdots \cup E_n$

1. Apply as long as possible the rules VA and Split of section E 

2. Perform successively the steps Cfllto CEl 

3. Apply a Solve rule for theory Ei to each accordingly to definition mi 

4. Return A A • • • A A Pp A 



Theorem 3. Given an equational theory $E = E_0 \cup \cdots \cup E_n$, where the $E_i$s
are defined over disjoint signatures $\mathcal{F}_0, \ldots, \mathcal{F}_n$, and a unification problem $P$
containing only algebraic symbols of $\mathcal{F}_0 \cup \cdots \cup \mathcal{F}_n$:

— The above algorithm returns a constrained DAG-$E$-solved form of $P$.

— Every $E$-unifier of $P$ is a solution of a constrained DAG-solved form
computed by the above algorithm.



Corollary 1. Unifiability of higher-order patterns is decidable in combinations 
of theories having a Solve rule. 



3 A Solve Rule for Some Syntactic Theories 

In we show how to do pattern unification for a narrow class of theories: 
a subset of the simple syntactic theories. For lack of space, we just give here 
some hints on how to design a Solve rule for the free theory, the theory of 
left-distributivity LD and the commutativity C. These three theories are simple 
theories, i.e., they have no equality between a term and one of its proper
subterms. As is well known from the work on first-order unification, compound
cycles or theory conflicts cannot be solved in such theories. It is easy to show
that the following two rules are correct for simple theories: 



Clash   $F = s \longrightarrow \bot$

if $F$ is $E_i$-instantiated and $E_j$, $j \neq i$, has been chosen for $F$ at CJ5I

Cycle   $F = c[G] \longrightarrow \bot$

if $c$ is a non-empty context and $F \not<_{oc} G$ for the ordering chosen at CJHl



Now, the free theory and LD have their symbols decomposable (i.e.,
$f(s_1, \ldots, s_n) =_E f(t_1, \ldots, t_n)$ iff $s_i =_E t_i$ for all $i$) and C enjoys a similar property:
$s_1 + s_2 =_C t_1 + t_2$ iff $s_1 =_C t_1 \wedge s_2 =_C t_2$ or $s_1 =_C t_2 \wedge s_2 =_C t_1$. Hence, the rules for
testing the compatibility of a solved form with a frozen equation are:
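Fail   $\lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi}) \longrightarrow \bot$

if $F$ is not a new variable and $\pi \notin Perm(F)$ as chosen at the step 00

Dec-Propagate

$F = \lambda\bar{x}.f(\bar{s}) \wedge \lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi}) \longrightarrow$
$F = \lambda\bar{x}.f(\bar{s}) \wedge \bigwedge_{1 \leq i \leq n} \lambda\bar{x}.s_i = \lambda\bar{x}^{\pi}.s_i$

if $f$ is a decomposable constant or a bound variable.

C-Propagate

$F = \lambda\bar{x}.s_1 + s_2 \wedge \lambda\bar{x}.F(\bar{x}) = \lambda\bar{x}.F(\bar{x}^{\pi}) \longrightarrow$
$F = \lambda\bar{x}.s_1 + s_2 \wedge ((\lambda\bar{x}.s_1 = \lambda\bar{x}^{\pi}.s_1 \wedge \lambda\bar{x}.s_2 = \lambda\bar{x}^{\pi}.s_2)$
$\vee\ (\lambda\bar{x}.s_1 = \lambda\bar{x}^{\pi}.s_2 \wedge \lambda\bar{x}.s_2 = \lambda\bar{x}^{\pi}.s_1))$

if $+$ is a commutative algebraic symbol.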



The Mutate rule of together with the rule Coalesce and a failure rule 
when two (non-new) variables are identified allow us to compute a solved form 
satisfying conditions 1 to 3 and 7 of Definition 10. The first of the two above
sets of rules allows us to fulfill conditions 4 and 5, and the second, condition 6. 

4 From AC to Abelian Groups

In this section, we consider associativity-commutativity (AC) and some of
its usual extensions: ACU (AC with unit), AG (Abelian groups) and ACUN
(ACU with nilpotence). For lack of space, we only give the flavor of a Solve rule
for these theories. Some more details can be found for AC in our previous paper 

0 - 

In the first-order case, the unification algorithm consists of counting the
number of times an immediate subterm from another theory occurs in each side of an
equation: both sides must have the same number of occurrences. We associate
with each algebraic variable $x$ an integer variable $x_t$ representing the number
of times the value of $x$ contains the term $t$ as an immediate subterm, and we
translate each equation between two terms into a linear equation over the
integers. These linear equations have to be solved over different integer domains
depending on the considered theory. Then the solutions for the unification
problem are built from the integer solutions, modulo some restrictions, in order to get
some “well-formed” terms, and a complete set of unifiers. Thanks to Theorem
2, the same approach can also be used in the pattern case, as shown in |S| for
the AC case. The main difference comes from the bound variables: if $\lambda\bar{x}.F(\bar{x})$
introduces a term $t(\bar{x})$, then $\lambda\bar{x}.F(\bar{x}^{\pi})$ introduces $t(\bar{x}^{\pi})$, and we do not know a
priori whether $t(\bar{x})$ and $t(\bar{x}^{\pi})$ are equal or not. This is exemplified below:

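4.1 An Example of an AC(+)-Unification Problem

Consider the equation $\mathcal{E} \equiv \lambda\bar{x}_3.\,2F(\bar{x}_3) + F(\bar{x}_3^{\pi}) + 9G(\bar{x}_3) = \lambda\bar{x}_3.\,2H(\bar{x}_3)$ where
$\pi = \{1 \mapsto 2;\ 2 \mapsto 3;\ 3 \mapsto 1\}$, to be solved modulo AC(+). If $F$ introduces $\alpha$ times
the term $t(\bar{x}_3)$, then $F^{\pi}$ introduces $\alpha$ times the term $t(\bar{x}_3^{\pi})$. If $t(\bar{x}_3)$ and $t(\bar{x}_3^{\pi})$
are distinct, we also have to count the number of times $F$ introduces $t(\bar{x}_3^{\pi})$, and
finally, we have to count the number of $t(\bar{x}_3^{\pi^2})$. Then we can stop since $\pi^3$ is
the identity. Let us denote by $\mathcal{G}(\pi)$ the group of permutations generated by $\pi$.
The above unification problem is translated into 2 subsystems:

$S_{\mathcal{G}(\pi)} \equiv 2\alpha + \alpha + 9\beta = 2\gamma$

$S_{id} \equiv \begin{cases} 2\alpha_{id} + \alpha_{\pi^2} + 9\beta_{id} = 2\gamma_{id} \\ 2\alpha_{\pi} + \alpha_{id} + 9\beta_{\pi} = 2\gamma_{\pi} \\ 2\alpha_{\pi^2} + \alpha_{\pi} + 9\beta_{\pi^2} = 2\gamma_{\pi^2} \end{cases}$

where $F$ (resp. $G$, $H$) introduces $\alpha$ (resp. $\beta$, $\gamma$) times a term $t(\bar{x}_3)$ such that
$\lambda\bar{x}_3.t(\bar{x}_3) =_E \lambda\bar{x}_3.t(\bar{x}_3^{\pi})$, and $\alpha_{\phi}$ (resp. $\beta_{\phi}$, $\gamma_{\phi}$) times $s(\bar{x}_3^{\phi})$, $\phi = id, \pi, \pi^2$, where
$\lambda\bar{x}_3.s(\bar{x}_3) \neq_E \lambda\bar{x}_3.s(\bar{x}_3^{\pi})$. These two systems are solved over non-negative
integers, and as in the first-order case, the unifiers are built from the Diophantine
solutions, with the main difference that in the pattern case the introduced variables
$L_i$ corresponding to a solution of $S_{\mathcal{G}(\pi)}$ are constrained by $Perm(L_i) = \mathcal{G}(\pi)$.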

In the same spirit, this can be done in the extensions of AC, such as ACU, 
AG and ACUN, where the equations are solved by counting how many times a 
variable introduces a given term. 
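To make the counting step concrete, the sketch below enumerates the non-negative solutions of the single linear equation $2\alpha + \alpha + 9\beta = 2\gamma$ from the example and keeps the componentwise-minimal non-zero ones. It is only a brute-force illustration under the assumption of a small search bound; the function names and the bound are hypothetical, and no claim is made that the authors solve the system this way.

```python
from itertools import product

def solutions(bound):
    """All (alpha, beta, gamma) in [0, bound]^3 with 2*alpha + alpha + 9*beta == 2*gamma."""
    return [(a, b, c)
            for a, b, c in product(range(bound + 1), repeat=3)
            if 2 * a + a + 9 * b == 2 * c]

def minimal_solutions(bound):
    """Non-zero solutions within the bound that dominate no other non-zero solution."""
    sols = [s for s in solutions(bound) if any(s)]

    def strictly_below(u, v):
        return u != v and all(x <= y for x, y in zip(u, v))

    return sorted(s for s in sols if not any(strictly_below(t, s) for t in sols))

print(minimal_solutions(10))  # [(0, 2, 9), (1, 1, 6), (2, 0, 3)]
```

Every other non-negative solution of this equation is a sum of these three, which is why unifiers can be built from the minimal solutions alone.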



4.2 Handling the Additional Constraints for a Solve Rule 

Let us assume now that the problem to be solved is a part of a combination
problem and has to be solved modulo the conditions of Definition 10. Each
variable $F$ comes with some additional assumptions, such as the theory in
which it can be instantiated, and its group of permutations $Perm(F)$ as
defined in section |3. These constraints are also translated into linear constraints
over integers. Indeed the constraint $Perm(H) = \mathcal{H}$ corresponds to the equations
$\lambda\bar{x}_3.H(\bar{x}_3) = \lambda\bar{x}_3.H(\bar{x}_3^{\phi})$, where $\phi \in \mathcal{H}$, and these equations are translated into
a system of linear equations over variables exactly in the same way as before,
except that we have to consider all subgroups of the permutation group generated
by $\pi$ and $\mathcal{H}$ as possible invariants for an introduced term.

A constraint like “the variable G cannot be instantiated in the considered 
theory” means here that the number of terms from another theory introduced by 
its value is exactly one. Only one among all of the integer variables corresponding 
to G has to be equal to 1, the others being null. 

A constraint like H </ioc G will be treated in a second step, after one has built 
the solutions to the unification problem. If we get a solution where $G$ occurs in
the value of $H$, this is due to the fact that $G$ is (equal to a new variable which
is) associated with a solution which has some non-zero values for some integer
variables corresponding to $H$. In the AC and ACU cases, such an integer solution
has to be discarded, while in the AG and ACUN cases, this problem can be fixed
by computing a particular solution such that these integer variables are null.

In all the 4 cases of AC, ACU, AG and ACUN, we are able to get a solved 
form which satisfies the additional hypotheses of the Solve rule. 
5 Conclusion

We believe that with the emergence of higher-order rewriting and higher-order 
logic programming, there will be a use for pattern unification modulo equational 
theories. The algorithm that we proposed here is meant to provide a decidability 
result: it will not behave satisfactorily in practice, due to the heavy nondeter- 
minism. It will be necessary to investigate how to reduce the nondeterminism as 
we did in the first-order case in m- Another issue of interest will be to develop 
matching algorithms which should be dramatically more efficient in practice. 

Although our method for elementary unification works well for the AC-like 
theories, we do not have a general method for ensuring the compatibility of a 
unifier with an equation of the form \xy.F{x,y) = \xy.F{y,x). For instance, 
the known methods for unification in Boolean rings do not use equations over 
the integers, and such equations do not translate naturally as shown in the 
previous section. Actually, we conjecture that there exists a theory with decidable
unification of problems with linear constant restriction (the equivalent in the
first-order case of our Solve rule fQ) and undecidable pattern unification.
Abstract. In this paper we give a simple and uniform presentation
of the rewriting calculus, also called Rho Calculus. In addition to
its simplicity, this formulation explicitly allows us to encode complex
structures such as lists, sets, and objects. We provide extensive examples
of the calculus, and we focus on its ability to represent some object
oriented calculi, namely the Lambda Calculus of Objects of Fisher,
Honsell, and Mitchell, and the Object Calculus of Abadi and Cardelli.
Furthermore, the calculus allows us to get object oriented constructions
unreachable in other calculi. In sum, we intend to show that because
of its matching ability, the Rho Calculus represents a lingua franca to
naturally encode many paradigms of computation. This highlights the
capability of the rewriting calculus based language ELAN to be used as
a logical as well as powerful semantical framework.



1 Introduction 

Matching is a feature provided implicitly in many, and explicitly in few,
programming languages. In this paper, by making matching a “first-class”
concept, we present, experiment with, and show the expressive power of a new
version of the rewriting calculus, also called Rho Calculus ($\rho$Cal).

The ability to discriminate patterns is one of the basic mechanisms on which
human reasoning is based; as one commonly says, “one picture is better than
a thousand explanations”. Indeed, the ability to recognize patterns, i.e. pattern
matching, has been present since the beginning of information processing modeling.
Instances of it can be traced back to pattern recognition and it has been 
extensively studied when dealing with strings m , trees IE] or feature objects | 2 |. 

Matching occurs implicitly in many languages through the parameter passing 
mechanism but often as a very simple instance, and explicitly in languages like 
PROLOG and ML, where it can be quite sophisticated \ZHtZ7\ . It is somewhat 
astonishing that one of the most commonly used models of computation, the
lambda calculus, uses only trivial pattern matching. This has been extended,
initially for programming concerns, either by the introduction of patterns in 
lambda calculi EEam, or by the introduction of matching and rewrite rules 
in functional programming languages. And indeed, many works address the 
integration of term rewriting with lambda calculus, either by enriching first- 
order rewriting with higher-order capabilities, or by adding to lambda calculus 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 77WII 2001. 
algebraic features allowing one, in particular, to deal with equality in an efficient
way. In the first case, we find the works on CRS I2n] and other higher-order 
rewriting systems , in the second case the works on combination of lambda 

calculus with term rewriting to mention only a few. 

Embedding more information in the matching process makes it appropriate 
to deal with complex tasks like program transformations 1201 or theorem 
proving m- In that direction, matching in elaborated theories has been 
also studied extensively, either in equational theories EE] or in higher-order 
logic where it is still an open problem at order five. 

Matching allows one to discriminate between alternatives. Once the patterns 
are recognized, the action to be taken on the appropriate pattern should be 
described, and this is what rewriting is designed for. The corresponding pattern 
is thus rewritten in an appropriate instance of a new one. The mechanism that 
describes this process is the rewriting calculus. Its main design concept is to 
make all the basic ingredients of rewriting explicit objects, in particular the 
notions of rule application and result. By making the application explicit, the 
calculus emphasizes on one hand the fundamental role of matching, and on the 
other hand the intrinsic higher-order nature of rewriting. By making the results 
explicit, the Rho Calculus has the ability to handle non-determinism in the sense 
of a collection of results: an empty collection of results represents an application 
failure, a singleton represents a deterministic result, and a collection with more 
than one element represents a non-deterministic choice between the elements of 
the collection. 

For example, assuming $a$, $b$, and $c$ to be different constants, in the rewriting
calculus the application of the rewrite rule $a \to b$ on the term $a$ is expressed by
the term $(a \to b) \bullet a$. This term is evaluated into the term $b$. The application of
the same rule on the term $c$ is expressed by the term $(a \to b) \bullet c$ and it evaluates to
null, therefore memorizing the fact that the rule is not applicable since the term
$a$ does not match the term $c$. Of course one can use variables and the simplest way
to do it is just with rewrite rules whose left-hand side is a single variable term,
like in $(X \to b) \bullet c$. Such a rule always applies and the previous term evaluates to
$b$. This trivial rule application corresponds indeed exactly to the lambda calculus
application: the previous term could be written as $(\lambda X.b)\ c$. This highlights one
of the nice features of the rewriting calculus: the ability to abstract not only on variables like
in the previous example, but also on arbitrary terms, including non-linear ones.

This matching power of the calculus provides important expressivity 
capabilities. As suggested by the previous example, it embeds lambda calculus, 
but also permits the representation of term rewrite derivations, even in the 
conditional case. More generally, it allows us to describe traversal, evaluation 
and search strategies like leftmost innermost, or breadth first [Jj. In [J], we have
shown that conditional rewrite rules of the form “$l \to r$ if $c$” can be faithfully
represented in the rewriting calculus as $l \to (True \to r) \bullet (strat \bullet c)$, where strat
is a suitable term representing the normalization strategy of the condition. 

Rewriting is central in several programming languages developed since the 
seventies. Amongst the main ones let us mention OBJ [18], ASF+SDF pni,
Maude ^0|, CafeOBJ fSI, Stratego |SE|, and ELAN [l24l,‘-{IM4j which has been at
the origin of some of the main concepts of the rewriting calculus. In turn, the 
Rho Calculus provides a natural semantics to such languages, and in particular 
to ELAN, covering the notion of rule application strategy, a fundamental concept 
of the language. 

In this paper, we give the newest description of the Rho Calculus, as 
introduced in 0. It provides a simplified version of the evaluation rules of the 
calculus as well as a generic and explicit handling of result structures, a point 
left open in the previous works [tip?] . 

The contributions of this paper are therefore the following: 



— we provide a broad set of examples showing the expressiveness of the 
Rho Calculus obtained mainly thanks to its “matching power” and how this 
makes it suitable to uniformly model various paradigms of computation; 

— we show how the matching power of the Rho Calculus allows us to encode 
two major object-calculi which have strongly influenced the type-theoretical 
research of the last five years: the Object Calculus ($\varsigma Obj$) of Abadi and
Cardelli and the Lambda Calculus of Objects of Fisher, Honsell, and
Mitchell HH ($\lambda Obj$). Moreover, we show two examples in Rho Calculus that
cannot be encoded in the above calculi. 



Road Map of the Paper. The paper is structured as follows: in Section 2 we
present the syntax and the small-step semantics of the Rho Calculus; Section 3
presents a plethora of examples describing the power of matching; Section 4
presents the encoding of the Lambda Calculus of Objects and of the Object
Calculus in the Rho Calculus. Conclusions and further work are finally discussed
in the final section. An extended version of the paper can be found in .



2 Syntax and Semantics 

Notational Conventions. In this paper, the symbol $t$ ranges over the set $T$ of
terms, the symbols $S, X, Y, Z, \ldots$ range over the infinite set $V$ of variables, the
symbols null, $\oplus$, $\circ$, $a, b, \ldots, z$, $0, 1, 2, \ldots$ range over the infinite set $C$ of constants
of fixed arity. All symbols can be indexed. The symbol $\equiv$ denotes syntactic
identity of objects like terms or substitutions. We work modulo $\alpha$-conversion,
and we follow the Barendregt convention that free and bound variables have
different names.



2.1 Syntax 

The syntax of the $\rho$Cal is defined as follows:

$t ::= a \mid X \mid t \to t \mid t \bullet t \mid$   plain terms
$\qquad\ \ null \mid t, t$   structured terms

The main intuition behind this syntax is that a rewrite rule $t \to t$ is an
abstraction, the left-hand side of which determines the bound variables and
some pattern structure. The application of a $\rho$Cal-term on another $\rho$Cal-term
is represented by “$\bullet$”. The terms can be grouped together into a structure built
using the operator “,” and, according to the theory behind this operator, different
structures can be obtained. The term null denotes an empty structure.

We assume that the application operator “$\bullet$” associates to the left, while the
“$\to$” and the “,” operators associate to the right. The priority of the application
“$\bullet$” is higher than that of the “$\to$” operator which is in turn of higher priority
than the “,” operator.

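Definition 1 (Some Type Signatures and Abbreviations).

$\to\ : T \times T \Rightarrow T$   $t_1.t_2 \triangleq t_1 \bullet t_2 \bullet t_1$   self-application
$\bullet\ : T \times T \Rightarrow T$   $t(t_1 \ldots t_n) \triangleq t \bullet t_1 \cdots \bullet t_n$   function-application ($n \in \mathbb{N}$)
$,\ : T \times T \Rightarrow T$   $(t_i)^{i=1 \ldots n} \triangleq t_1, \ldots, t_n$   structure ($n \in \mathbb{N}$)

We draw the attention of the reader to the main difference between “$\bullet$”, denoting
the application, and “.”, denoting the object-oriented self-application operator.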



2.2 Matching Theories 

An important parameter of the $\rho$Cal is the matching theory $T$. We give below
examples of theories $T$ defined equationally.



Definition 2 (Matching theories).

— the Empty theory $T_\emptyset$ of equality (up to $\alpha$-conversion) is defined by the
following inference rules:

$\dfrac{t_1 = t_2 \quad t_2 = t_3}{t_1 = t_3}$ (Tra)   $\dfrac{t_1 = t_2}{t_2 = t_1}$ (Sym)   $\dfrac{t_1 = t_2}{t_3[t_1]_p = t_3[t_2]_p}$ (Ctx)

where $t_1[t_2]_p$ denotes the term $t_1$ with the term $t_2$ at position $p$. The $\alpha$-
conversion definition follows the standard intuition and is made precise for
$\rho$Cal in

— the theory of Commutativity $T_{C(f)}$ (resp. Associativity $T_{A(f)}$) is defined as
$T_\emptyset$ plus the following inference rules:

$f(t_1\ t_2) = f(t_2\ t_1)$ (Com)   $f(f(t_1\ t_2)\ t_3) = f(t_1\ f(t_2\ t_3))$ (Ass)

— the theory of Idempotency $T_{I(f)}$ is defined as $T_\emptyset$ plus the axiom $f(t\ t) = t$.

— the theory of Neutral Element $T_{N(f^0)}$ is defined as $T_\emptyset$ plus the following
inference rules:

$f(0\ t) = t$   $f(t\ 0) = t$

— the theory of the Lambda Calculus of Objects, $T_{\lambda Obj}$, is obtained by
considering the symbol “,” as associative and null as its neutral element,
i.e.:

$T_{\lambda Obj} = T_{A(,)} \cup T_{N(,^{null})}$

— the theory of the Object Calculus, $T_{\varsigma Obj}$, is obtained by considering the symbol
“,” as associative and commutative and null as its neutral element, i.e.:

$T_{\varsigma Obj} = T_{A(,)} \cup T_{C(,)} \cup T_{N(,^{null})} = T_{\lambda Obj} \cup T_{C(,)}$

Other interesting theories can be built from the above ones, such as e.g.
$T_{MSet(f,nil)}$ and $T_{Set(f,nil)}$ [7]. For the sake of completeness, we include in
the paper the definition of syntactic matching, which can also be found in [7],
together with more explanatory examples.

Definition 3 (Syntactic Matching). For a given theory $T$ over $\rho$Cal-terms:

1. a $T$-match equation is a formula of the form $t_1 \ll_T t_2$;

2. a substitution $\sigma$ is a solution of the $T$-match equation $t_1 \ll_T t_2$ if $\sigma t_1 =_T t_2$;

3. a $T$-matching system is a conjunction of $T$-match equations;

4. a substitution $\sigma$ is a solution of a $T$-matching system if it is a solution of all
the $T$-match equations in it;

5. a $T$-matching system is trivial when all substitutions are solutions of it, and
we denote by $\mathbf{F}$ a $T$-matching system without solution;

6. we define the function $Sol$ on a $T$-matching system $\mathcal{T}$ as returning the
$\prec$-ordered list of all $T$-matches of $\mathcal{T}$ when $\mathcal{T}$ is not trivial and the list
containing only $\sigma_{id}$, where $\sigma_{id}$ is the identity substitution, when $\mathcal{T}$ is trivial.



Notice that when the matching algorithm fails (i.e. returns $\mathbf{F}$), the function $Sol$
returns the empty list. A more detailed discussion on decidability of matching
can be found in [7].

For example, in $T_\emptyset$, the matching substitution from a $\rho$Cal-term $t_1$ to a
$\rho$Cal-term $t_2$ can be computed by the rewrite system presented in Figure 1,
where the symbol $\wedge$ is assumed to be associative and commutative, and $o_1$, $o_2$
are either constant symbols or the prefix notations of “$\bullet$” or “$\to$” or “,”.

Starting from a matching system $\mathcal{T}$, the application of this rule set terminates
and returns either $\mathbf{F}$ when there are no substitutions solving the system, or a
system $\mathcal{T}'$ in “normal form” from which the solution can be trivially inferred [23].
This set of rules could be extended to deal with more elaborate theories like
commutativity.
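The matching procedure for the empty theory can be sketched as a recursive function rather than as a rewrite system. This is an illustration under stated assumptions, not the rule set of the paper: terms are encoded as Python values (a capitalized string is a variable, any other string a constant, a tuple `(symbol, t1, ..., tn)` a compound term), equality is plain syntactic equality (no $\alpha$-conversion), and the names `is_var` and `match` are hypothetical.

```python
def is_var(t):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst=None):
    """Return a dict sigma with sigma(pattern) == term, or None when matching fails."""
    subst = dict(subst or {})
    if is_var(pattern):
        # (X << t) and (X << t') is solvable only when t and t' are equal.
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        # Constants match only themselves; a non-variable never matches a variable.
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None  # different head symbols or arities
    for p, t in zip(pattern[1:], term[1:]):  # decomposition
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst
```

For instance, `match(("Pair", "X", "Y"), ("Pair", "a", "b"))` yields `{"X": "a", "Y": "b"}`, while a non-linear pattern such as `("f", "X", "X")` fails against `("f", "a", "b")`.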



2.3 Operational Semantics 

For a given total ordering $\prec$ on substitutions (which is left implicit in
the notation) and a theory $T$, the operational semantics is defined by the
computational rules given in Figure 2. The central idea of the main rule of
the calculus ($\rho$) is that the application of a rewrite rule $t_1 \to t_2$ at the root
position of a term $t_3$ consists in computing all the solutions of the matching
equation ($t_1 \ll_T t_3$) in the theory $T$ and applying all the substitutions from the
$\prec$-ordered list returned by the function $Sol(t_1 \ll_T t_3)$ to the term $t_2$. When there



We consider a total order $\prec$ on the set of substitutions $\Theta$.
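$o_1(t_1 \ldots t_n) \ll_{T_\emptyset} o_2(t'_1 \ldots t'_m) \ \rightsquigarrow\ \bigwedge_{i=1 \ldots n} t_i \ll_{T_\emptyset} t'_i$ if $o_1 = o_2$ and $n = m$; $\mathbf{F}$ otherwise

$(X \ll_{T_\emptyset} t) \wedge (X \ll_{T_\emptyset} t') \ \rightsquigarrow\ X \ll_{T_\emptyset} t$ if $t =_{T_\emptyset} t'$; $\mathbf{F}$ otherwise

$t \ll_{T_\emptyset} X \ \rightsquigarrow\ \mathbf{F}$ if $t \notin V$

$\mathbf{F} \wedge (t \ll_{T_\emptyset} t') \ \rightsquigarrow\ \mathbf{F}$

Fig. 1. Rules for Syntactic Matching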



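($\rho$)  $(t_1 \to t_2) \bullet t_3 \ \mapsto_T\ \begin{cases} null & \text{if } t_1 \ll_T t_3 \text{ has no solution} \\ \sigma_1 t_2, \ldots, \sigma_n t_2 & \text{if } \sigma_i \in Sol(t_1 \ll_T t_3),\ \sigma_i \prec \sigma_{i+1},\ n \leq \infty \end{cases}$

($\varepsilon$)  $(t_1, t_2) \bullet t_3 \ \mapsto_T\ t_1 \bullet t_3,\ t_2 \bullet t_3$

($\nu$)  $null \bullet t \ \mapsto_T\ null$

Fig. 2. Evaluation rules of the $\rho$Cal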



is no solution for the matching equation $t_1 \ll_T t_3$, the special constant null is
obtained as result of the application. Notice that in some theories, there could
be an infinite set of solutions to the matching problem ($t_1 \ll_T t_3$); possible ways
to deal with the infinitary case are described in [S| .

The other rules ($\varepsilon$) and ($\nu$) deal with the distributivity of the application on
the structures whose constructors are “,” and null. When the theory $T$ is clear
from the context, its denotation will be omitted. Notice that if $t_1$ is a variable,
then the ($\rho$)-rule corresponds exactly to the ($\beta$)-rule of the lambda calculus.
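The three evaluation rules can be prototyped in a few lines for the empty theory, where a match equation has at most one solution. This is a minimal sketch, not a reference implementation: it assumes a hypothetical tuple encoding (`("->", l, r)` for rules, `("@", f, a)` for application, `(",", t1, t2)` for structures, capitalized strings for variables), does not reduce under arrows, and relies on the Barendregt convention instead of implementing capture-avoiding substitution.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, subst):
    """Syntactic matching; returns an extended substitution or None."""
    if is_var(pattern):
        if subst.get(pattern, term) != term:
            return None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    if isinstance(t, str):
        return subst.get(t, t)
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])

def evaluate(t):
    """Innermost evaluation; rewrite rules themselves are treated as values."""
    if isinstance(t, str) or t[0] == "->":
        return t
    args = tuple(evaluate(a) for a in t[1:])
    if t[0] != "@":
        return (t[0],) + args
    fun, arg = args
    if fun == "null":                                   # rule (nu)
        return "null"
    if isinstance(fun, tuple) and fun[0] == ",":        # rule (epsilon)
        return (",", evaluate(("@", fun[1], arg)), evaluate(("@", fun[2], arg)))
    if isinstance(fun, tuple) and fun[0] == "->":       # rule (rho)
        subst = match(fun[1], arg, {})
        return "null" if subst is None else evaluate(substitute(fun[2], subst))
    return ("@", fun, arg)                              # no rule applies
```

With this encoding, `evaluate(("@", ("->", "a", "b"), "c"))` returns `"null"`, and applying the record `(",", ("->", "cx", "0"), ("->", "cy", "0"))` to `"cx"` returns `(",", "0", "null")`. Non-terminating terms will of course make `evaluate` loop.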

With respect to the previous presentation of the Rho Calculus [7], we have
modified the notation of the application operator which was denoted [-](-), but 
more importantly, the evaluation rules have been simplified on one hand, and 
generalized to deal with generic result structures on the other hand. 

As usual, given a theory $T$, we denote by $=_\rho$ the smallest reflexive, symmetric,
and transitive relation containing $\mapsto_T$, stable by context and substitution. When
working modulo reasonably powerful theories $T$, the evaluation rules of the $\rho$Cal
are confluent:

Theorem 1 (Confluence in $T_\emptyset$). Given a term $t_1$ such that all its abstractions
contain no arrow in the first argument, if $t_1 \mapsto\!\!\!\to_{T_\emptyset} t_2$ and $t_1 \mapsto\!\!\!\to_{T_\emptyset} t_3$ then there
exists a term $t_4$ such that $t_2 \mapsto\!\!\!\to_{T_\emptyset} t_4$ and $t_3 \mapsto\!\!\!\to_{T_\emptyset} t_4$.

Proof. The proof follows the same lines defined in E 

3 Examples in Rho Calculus 

In the following section we present some simple examples intended to help the 
reader in the understanding of the behavior of the Rho Calculus. 
Example 1 (In $T_\emptyset$).

1. The application of the simple rewrite rule $a \to b$ to $a$, i.e. $(a \to b) \bullet a$, is
evaluated to $b$ since $Sol(a \ll_{T_\emptyset} a) = \sigma_{id}$ and $\sigma_{id} b = b$;

2. The matching between the left-hand side of the rule and the argument can
also fail and in this case the result of the application is the constant null,
i.e.: $(a \to b) \bullet c \mapsto null$;

3. When the left-hand side of a rewrite rule is not a ground term, the matching
can yield a substitution different from $\sigma_{id}$, e.g. $(X \to X) \bullet a \mapsto [X/a]X \equiv a$;

4. The non-deterministic application of two rewrite rules is represented by the
application of the structure containing the respective rules:
$(X \to X(a), Y \to Y(b)) \bullet c \mapsto (X \to X(a)) \bullet c, (Y \to Y(b)) \bullet c \mapsto [X/c]X(a),
[Y/c]Y(b) \equiv c(a), c(b)$;

5. The selection of the field $cx$ inside the record structure $(cx \to 0, cy \to 0)$
evaluates to the term $(0, null)$, i.e.: $(cx \to 0, cy \to 0) \bullet cx \mapsto (cx \to 0) \bullet cx,
(cy \to 0) \bullet cx \mapsto (0, null)$;

6. Functions are first-class entities in the $\rho$Cal: $(X \to (X \bullet a)) \bullet (Y \to Y) \mapsto
(Y \to Y) \bullet a \mapsto a$;

7. The lambda calculus with patterns can be easily represented in the
$\rho$Cal. For instance, the lambda-term $\lambda Pair(X\ Y).X$ can be represented
and reduced as follows: $(Pair(X\ Y) \to X) \bullet Pair(a\ b) \mapsto [X/a, Y/b]X \equiv a$;

8. Starting from the fixed-point combinators of the lambda calculus, we can
define a $\rho$Cal-term that applies recursively a given $\rho$Cal-term. We use the
classical fixed-point $Y_\lambda \triangleq (A_\lambda\ A_\lambda)$ with $A_\lambda \triangleq \lambda X.\lambda Y.Y(XXY)$, which can be
translated as $Y_\rho \triangleq A_\rho \bullet A_\rho$ with $A_\rho \triangleq X \to Y \to Y \bullet (X \bullet X \bullet Y)$. Then:
$Y_\rho \bullet t \equiv A_\rho \bullet A_\rho \bullet t \equiv (X \to Y \to Y \bullet (X \bullet X \bullet Y)) \bullet A_\rho \bullet t \mapsto (Y \to Y \bullet (A_\rho \bullet A_\rho \bullet Y)) \bullet t \mapsto
t \bullet (A_\rho \bullet A_\rho \bullet t) \equiv t \bullet (Y_\rho \bullet t)$. Starting from the $Y_\rho$ term, we can define more
elaborate terms describing, for example, the repeated application of a given
term or normalization strategies according to a given rewrite rule [7];

9. Let $car \triangleq X, Y \to X$, $cdr \triangleq X, Y \to Y$, $cons \triangleq X \to Y \to (X, Y)$. It is easy
to check that $car(a, b, c, null) \mapsto\!\!\!\to a$, and that $cdr(a, b, c, null) \mapsto\!\!\!\to b, c, null$, and
that $cons(d\ \ a, b, c, null) \mapsto\!\!\!\to d, a, b, c, null$.



Example 2 (In $T_A$, $T_C$, $T_{AC}$ and $T_{N(f^0)}$).

1. ($T_{A(\circ)}$) The application of the rewrite rule $\circ(X\ Y) \to X$ to $\circ(a \circ (b \circ (c\ d)))$
reduces, thanks to the associativity of $\circ$, to $(a, \circ(a\ b), \circ(a \circ (b\ c)))$;

2. ($T_{C(\oplus)}$) The application of the rewrite rule $\oplus(X\ Y) \to X$ to $\oplus(a\ b)$
reduces, thanks to the commutativity of $\oplus$, to $(a, b)$, a structure
representing all possible results;

3. ($T_{AC(\oplus)}$) The application of the rewrite rule $\oplus(X \oplus (X\ Y)) \to \oplus(X\ Y)$ to
$\oplus(a \oplus (b \oplus (c \oplus (a\ d))))$ reduces to $\oplus(a \oplus (b \oplus (c\ d)))$. The search for
the two equal elements is done by matching thanks to the associativity-
commutativity of the $\oplus$ operator, while the elimination of doubles is
performed by the rewrite rule;

4. ($T_{N(\oplus^0)}$) Using a theory with a neutral element allows us to “ignore”
variables from the rewrite rules. For example, the rewrite rule $X \oplus a \oplus Y \to
X \oplus b \oplus Y$ replaces an $a$ with a $b$ in a structure built using the “$\oplus$” operator
and containing one or more elements. The application of the previous rewrite
rule to $b \oplus a \oplus b$ reduces to $b \oplus b \oplus b$ and the same rule applied to $a$ leads to
$b$, since $a =_{T_{N(\oplus^0)}} 0 \oplus a \oplus 0$.
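Matching modulo an equational theory may produce several substitutions, hence a structure of results. The sketch below illustrates this for commutativity only; it is an assumption-laden illustration, not the matching algorithm of the paper. Terms use a hypothetical tuple encoding, `"+"` stands for the commutative symbol, bound values are compared syntactically rather than modulo C, and duplicate solutions are not removed.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def matches(pattern, term, subst, commutative):
    """Yield every substitution extending subst that matches pattern to term."""
    if is_var(pattern):
        if pattern not in subst:
            yield {**subst, pattern: term}
        elif subst[pattern] == term:
            yield subst
        return
    if isinstance(pattern, str) or isinstance(term, str):
        if pattern == term:
            yield subst
        return
    if pattern[0] != term[0] or len(pattern) != len(term):
        return
    orders = [term[1:]]
    if pattern[0] in commutative and len(term) == 3:
        orders.append(term[:0:-1])        # also try the swapped arguments
    for args in orders:
        yield from match_args(pattern[1:], args, subst, commutative)

def match_args(patterns, terms, subst, commutative):
    if not patterns:
        yield subst
        return
    for s in matches(patterns[0], terms[0], subst, commutative):
        yield from match_args(patterns[1:], terms[1:], s, commutative)

def substitute(t, subst):
    if isinstance(t, str):
        return subst.get(t, t)
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])

def apply_rule(lhs, rhs, term, commutative=frozenset()):
    """List of results of (lhs -> rhs) applied to term; [] plays the role of null."""
    return [substitute(rhs, s) for s in matches(lhs, term, {}, commutative)]

print(apply_rule(("+", "X", "Y"), "X", ("+", "a", "b"), {"+"}))  # ['a', 'b']
```

Without the commutativity declaration the same call returns only `['a']`, which mirrors the difference between the empty theory and $T_C$ in item 2 above.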

The next example shows how the object oriented paradigm can be easily captured
in $T_{\lambda Obj}$. In particular we focus our example on the usage of the pseudo-
variable this, which is crucial for sending messages inside method bodies. In the
$\rho$Cal, a method is seen as a term of the shape $m \to S \to t_m$, where $m$ is the
name of the method, $S$ is a variable playing the role of this and $t_m$ is the body
of the method, which can contain free occurrences of $S$. Sending a message $m$ to a
structure (i.e. an object) $t$ is represented via the alias $t.m$, i.e. $t \bullet m \bullet t$. Intuitively,
if the method $m$ exists in the structure $t$, then its body $t_m$ can be executed with
the binding of $S$ to the object itself. This type of application is also called, in
the object-oriented jargon, self-application, and it is fundamental for modeling
mutual recursion between methods inside an object.

Example 3 (In $T_{\lambda Obj}$).

1. This example presents a simple object $t$ with only one method $a$ that
does not actually use the variable $S$. Let $t \triangleq a \to S \to b$. Then,
$t.a \equiv t \bullet a \bullet t \mapsto (\sigma_{id}(S \to b)) \bullet t \equiv (S \to b) \bullet t \mapsto b$;

2. This example presents an object $t$ with a non-terminating method $\omega$. Let
$t \triangleq \omega \to S \to S.\omega$. Then, $t.\omega \mapsto (S \to S.\omega) \bullet t \mapsto t.\omega \mapsto \ldots$;

3. We consider another object with a non-terminating behavior consisting of
two methods ping and pong, one calling the other via the variable $S$. Let
$t \triangleq (ping \to S \to S.pong, pong \to S \to S.ping)$. Then, $t.ping \equiv t \bullet ping \bullet t \mapsto
((ping \to S \to S.pong) \bullet ping, (pong \to S \to S.ping) \bullet ping) \bullet t \mapsto\!\!\!\to (S \to
S.pong) \bullet t, null =_{T_{\lambda Obj}} (S \to S.pong) \bullet t \mapsto t.pong \mapsto\!\!\!\to t.ping \mapsto\!\!\!\to \ldots$

In the above example, we can notice how natural the use of matching is for 
directly selecting the method name. Starting from these simple examples, we 
can now imagine how matching can be used in its full generality (i.e. allowing
variables as well as appropriate equational theories) in order to deal with more 
general objects and methods. The purpose of the rest of this paper is to make 
these aspects precise. 

4 Object-Based in Rho Calculus 

In this section we focus on two major object-calculi which have influenced the 
type-theoretical research of the last five years: 

— The Lambda Calculus of Objects of Fisher, Honsell, and Mitchell CH; 

— The Object Calculus of Abadi and Cardelli p. 
Both calculi are prototype-based, i.e. they are based on the notion of “objects” and
not of “classes”. Nevertheless, classes can be easily encoded as suitable objects.
Those calculi have been extensively studied in a “typed” setting where the main
objective was to conceive sound type systems capturing the unfortunate
run-time error message-not-understood, which happens when we send a message $m$
to an object which does not have the method $m$ in its interface.

As previously shown in Example 3, structured terms are well suited to
represent objects and to model the special pseudovariable this. In order to
support the intuition, we start by showing the way some classical examples of
objects can be easily expressed in the $\rho$Cal.

Example 4 (Point Object Encoding in $T_{\lambda Obj}$). Given the symbols val, get, set
and v (used to denote pairs), an object Point is encoded in ρCal by

$val \to S \to v(1\ 1),\ get \to S \to S.val,\ set \to S \to v(X\ Y) \to (S,\ val \to S' \to v(X\ Y))$

The term Point represents an object with an attribute val and two
methods get and set. The method get gives access to the attribute, while
method set is used for modifying the attribute by adding the new value
at the end of the object. In this context, it is easy to check that
$Point.get \mapsto^{*} v(1\ 1)$, and $Point.set(v(2\ 2)) \mapsto^{*} Point,\ (val \to S' \to v(2\ 2))$, and
$Point.set(v(2\ 2)).get \mapsto^{*} v(1\ 1),\ v(2\ 2)$. Worthy of notice is that:

1. The call $Point.set(v(2\ 2))$ produces a result which consists of the old Point
and the new (modified) value for the attribute val, i.e. $val \to S' \to v(2\ 2)$;

2. The call $Point.set(v(2\ 2)).get$ produces a structure composed of two
elements, the former representing the value of val before the execution of
set (i.e. before a side effect), and the latter one after the execution of set;

3. A trivial strategy to recover determinism is to consider only the last value
from the list of results, i.e. $v(2\ 2)$. From this point of view, the ρCal can
also be understood as a formalism to study side effects in imperative calculi;

4. A way to fix imperative features is to modify the encoding of the method
set by considering the term $kill_n \triangleq (X,\ n \to Z,\ Y) \to X, Y$, and by defining
the new object $Point_{imp}$ as

$val \to \ldots,\ get \to \ldots,\ set \to S \to v(X\ Y) \to (kill_{val}(S),\ val \to S' \to v(X\ Y))$

such that $Point_{imp}.get \mapsto^{*} v(1\ 1)$, and $Point_{imp}.set(v(2\ 2)) \mapsto^{*} val \to S' \to v(2\ 2),\ get \to \ldots,\ set \to \ldots$, and $Point_{imp}.set(v(2\ 2)).get \mapsto^{*} v(2\ 2)$;

5. The moral of this example is that the encoding of objects into the ρCal can
strongly modify the behavior of a computation.
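
The difference between the two Point encodings can be replayed with a small Python sketch. It is only an illustration under simplifying assumptions, not the ρCal itself: an object is a tuple of (label, body) pairs, a message send applies every method carrying that label to the receiver and collects all results in a list (standing in for a structure), and the helper names send, kill and constant are hypothetical. In particular, kill here drops every method with the given label, which is coarser than the matching-based kill term.

```python
def send(obj, label, *args):
    """Send `label` to `obj`: apply every method with that label to the receiver."""
    results = []
    for name, body in obj:
        if name == label:
            results.extend(body(obj, *args))
    return results


def kill(obj, label):
    """Drop every method called `label` (a simplification of the kill term)."""
    return tuple((name, body) for name, body in obj if name != label)


def constant(value):
    """Method body that ignores the receiver and yields a single value."""
    return lambda self: [value]


# Functional Point: set appends a new val method at the end of the object.
point = (
    ("val", constant((1, 1))),
    ("get", lambda self: send(self, "val")),
    ("set", lambda self, v: [self + (("val", constant(v)),)]),
)

# "Imperative" Point: set first kills the old val method.
point_imp = (
    ("val", constant((1, 1))),
    ("get", lambda self: send(self, "val")),
    ("set", lambda self, v: [kill(self, "val") + (("val", constant(v)),)]),
)

print(send(point, "get"))                              # [(1, 1)]
print(send(send(point, "set", (2, 2))[0], "get"))      # [(1, 1), (2, 2)]
print(send(send(point_imp, "set", (2, 2))[0], "get"))  # [(2, 2)]
```

The second printed list shows the two-element result discussed in item 2, and the third shows the single result obtained once the old val method is removed.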

In the next example we present the encoding of the Fisher, Honsell, and Mitchell
fixed-point operator and its generalization in the ρCal.

Example 5 (A Fixed Point Object). Assume symbols rec and f. The fixed-point
object $Fix_f$ for f can be represented in the ρCal as $Fix_f \triangleq rec \to S \to f(S.rec)$.
It is not hard to verify that $Fix_f.rec \mapsto^{*} f(Fix_f.rec)$. This fixed point can be
generalized as $Fix \triangleq rec \to S \to X \to X(S.rec(X))$ and its behavior will be
$Fix.rec(f) \mapsto^{*} f(Fix.rec(f))$.

4.1 The Lambda Calculus of Objects

We now present a translation of the Lambda Calculus of Objects of Fisher,
Honsell, and Mitchell into the ρCal. This calculus is an untyped lambda
calculus with constants enriched with object primitives. A new object can be
created by modifying and/or extending an existing prototype object; the result
is a new object which inherits all the methods and fields of the prototype. This
calculus is trivially computationally complete, since the lambda calculus is built
into the calculus itself. The syntax and the small-step semantics of λObj we present
in this paper are inspired by the work of inj. 



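Syntax and Operational Semantics. The syntax of the calculus is defined
as follows:

$M, N ::= \lambda X.M \mid M N \mid X \mid c \mid \langle\rangle \mid \langle M \leftarrow n = N \rangle \mid \langle M \mathrel{\leftarrow\!+} n = N \rangle \mid M \Leftarrow n \mid Sel(M, m, N)$

Let $\mathrel{\leftarrow\!*}$ be either $\leftarrow$ or $\mathrel{\leftarrow\!+}$; the small-step semantics is defined by

$(\lambda X.M)\, N \mapsto_{\lambda Obj} [X/N]M$

$M \Leftarrow m \mapsto_{\lambda Obj} Sel(M, m, M)$

$Sel(\langle M \mathrel{\leftarrow\!*} n = N \rangle, n, P) \mapsto_{\lambda Obj} N\, P$

$Sel(\langle M \mathrel{\leftarrow\!*} n = N \rangle, m, P) \mapsto_{\lambda Obj} Sel(M, m, P)$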

The main operation on objects is method invocation, whose reduction is 
defined by the second rule. Sending a message m to an object M, containing 
a method m, reduces to Sel{M,m, M). More generally, in the expression 
Sel{M,m, N), the term N represents the receiver (or recipient) of the message, 
the constant m is the message we want to send to the receiver of the message, 
and the term M is (or reduces to) a proper sub-object of N. 

By looking at the last two rewrite rules, one may note that the Sel function 
“scans” the recipient of the message until it finds the definition of the method 
we want to use; when it finds the body of the method, it applies this body to the 
recipient of the message. The operational semantics in was based on a more 
elaborate bookkeeping relation which transforms the receiver {i.e. an ordered list 
of methods) into another equivalent object where the method we are calling is 
always the last overridden one. 
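
The scanning behavior of Sel can be sketched in Python. This is a minimal illustration under stated assumptions rather than the calculus itself: objects are nested tuples, method bodies are Python functions of the receiver standing in for λ-terms, and the names EMPTY, extend, override, sel and send are hypothetical. The two sample objects are the diagonal point and the self-extending object used as examples in this section.

```python
EMPTY = ("empty",)  # the empty object <>


def extend(obj, name, body):
    """<obj <-+ name = body>: add a method (body is a Python function of self)."""
    return ("ext", obj, name, body)


def override(obj, name, body):
    """<obj <- name = body>: override a method."""
    return ("ovr", obj, name, body)


def sel(obj, name, receiver):
    """Sel(obj, name, receiver): scan obj, then apply the body found to the receiver."""
    while obj[0] != "empty":
        _, rest, label, body = obj
        if label == name:
            return body(receiver)  # Sel(<M <-* n = N>, n, P)  ->  N P
        obj = rest                 # Sel(<M <-* n = N>, m, P)  ->  Sel(M, m, P)
    raise LookupError(f"message not understood: {name}")


def send(obj, name):
    """M <= m  ->  Sel(M, m, M)."""
    return sel(obj, name, obj)


# Diagonal point: x forwards to y, y returns 1.
point = extend(extend(EMPTY, "x", lambda s: send(s, "y")), "y", lambda s: 1)
print(send(point, "x"))  # 1

# Self-extending object: add_n returns the receiver extended with method n.
self_ext = extend(EMPTY, "add_n", lambda s: extend(s, "n", lambda s2: 1))
print(send(send(self_ext, "add_n"), "n"))  # 1
```

Because sel always stops at the outermost definition of a label, an overriding definition shadows the older one without any bookkeeping step.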

As a simple example of the calculus, we show an object which has the 
capability to extend itself simply by receiving a message which encodes the 
method to be added. 

Example 6 (An object with “self-extension”). Consider the object
$Self\_ext \triangleq \langle \langle\rangle \mathrel{\leftarrow\!+} add\_n = \lambda S.\langle S \mathrel{\leftarrow\!+} n = \lambda S'.1 \rangle \rangle$. If we send the message add_n to
Self_ext, then we get $Self\_ext \Leftarrow add\_n \mapsto_{\lambda Obj} Sel(Self\_ext, add\_n, Self\_ext) \mapsto_{\lambda Obj} (\lambda S.\langle S \mathrel{\leftarrow\!+} n = \lambda S'.1 \rangle)\, Self\_ext \mapsto_{\lambda Obj} \langle Self\_ext \mathrel{\leftarrow\!+} n = \lambda S'.1 \rangle$, resulting in the
method n being added to Self_ext.

The Translation of λObj into ρCal. The translation of a λObj-term into a
corresponding ρCal-term is quite trivial and can be done in the theory $T_{\lambda Obj}$
where the symbol “,” is associative and null is its neutral element. Intuitively,
an object in λObj is translated into a simple structure in ρCal. The choice we
made for object override is an imperative one, i.e. we delete the method we
are overriding using the kill function defined in Example 4. The translation is
defined as follows:



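$[\![c]\!] \triangleq c$
$[\![X]\!] \triangleq X$
$[\![\lambda X.M]\!] \triangleq X \to [\![M]\!]$
$[\![M N]\!] \triangleq [\![M]\!] \bullet [\![N]\!]$
$[\![\langle\rangle]\!] \triangleq null$
$[\![\langle M \leftarrow n = N \rangle]\!] \triangleq kill_n([\![M]\!]),\ n \to [\![N]\!]$
$[\![\langle M \mathrel{\leftarrow\!+} n = N \rangle]\!] \triangleq [\![M]\!],\ n \to [\![N]\!]$
$[\![M \Leftarrow m]\!] \triangleq [\![M]\!].m$
$[\![Sel(M, m, N)]\!] \triangleq [\![M]\!] \bullet m \bullet [\![N]\!]$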





For instance, Example 7 shows a simple computation in λObj
and the corresponding translation into the ρCal, and Example 8 presents the
translation of the Self_ext object into the ρCal.

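Example 7 (A Simple Computation). Let Point be the simple diagonal point
$\langle \langle \langle\rangle \mathrel{\leftarrow\!+} x = \lambda S.S \Leftarrow y \rangle \mathrel{\leftarrow\!+} y = \lambda S'.1 \rangle$. Then
$Point \Leftarrow x \mapsto_{\lambda Obj} Sel(Point, x, Point) \mapsto_{\lambda Obj} Sel(\langle \langle\rangle \mathrel{\leftarrow\!+} x = \lambda S.S \Leftarrow y \rangle, x, Point) \mapsto_{\lambda Obj} (\lambda S.S \Leftarrow y)\, Point \mapsto_{\lambda Obj} Point \Leftarrow y \mapsto_{\lambda Obj} Sel(Point, y, Point) \mapsto_{\lambda Obj} (\lambda S'.1)\, Point \mapsto_{\lambda Obj} 1$.

The above computation in λObj can be easily translated into a corresponding
computation in ρCal using $t \triangleq [\![Point]\!] = x \to S \to S.y,\ y \to S \to 1$ as follows:
$[\![Point \Leftarrow x]\!] = t.x = (x \to S \to S.y,\ y \to S \to 1) \bullet x \bullet t \mapsto^{*} (S \to S.y,\ null) \bullet t =_{T_{\lambda Obj}} (S \to S.y) \bullet t \mapsto t.y = (x \to S \to S.y,\ y \to S \to 1) \bullet y \bullet t \mapsto^{*} (null,\ S \to 1) \bullet t =_{T_{\lambda Obj}} (S \to 1) \bullet t \mapsto 1$.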

Example 8 (Translation of Self_ext). The object Self_ext can be easily translated
into the ρCal as $t_1 \triangleq [\![Self\_ext]\!] = add\_n \to S \to (S,\ n \to S' \to 1)$. Then:
$(t_1.add\_n).n \mapsto^{*} ((S \to (S,\ n \to S' \to 1)) \bullet t_1).n \mapsto (t_1,\ n \to S' \to 1).n = t_2.n \mapsto^{*} null,\ (S' \to 1) \bullet t_2 =_{T_{\lambda Obj}} (S' \to 1) \bullet t_2 \mapsto 1$,
where $t_2 \triangleq (t_1,\ n \to S' \to 1)$.

The translation into the ρCal can be proved correct when the theory $T_{\lambda Obj}$ is
considered:

Theorem 2 (Translation of λObj into ρCal). If $M \mapsto_{\lambda Obj} N$, then
$[\![M]\!] \mapsto^{*} [\![N]\!]$.

4.2 The Object Calculus

The Object Calculus is a calculus where the only existing entities are
the objects; it is computationally complete since the λ-calculus, fixed points and
complex structures can be easily encoded within it. A large collection of variants
(functional and imperative, typed and untyped) of this calculus is presented
in the book and in the literature.

Syntax and Operational Semantics. The syntax of the object calculus is
defined as follows:

$a, b ::= X \mid [m_i = \varsigma(X)b_i]^{i=1 \ldots n} \mid a.m \mid a.m := \varsigma(X)b$

Let $a \triangleq [m_i = \varsigma(X)b_i]^{i=1 \ldots n}$; the small-step semantics is:

$a.m_j \mapsto_{\varsigma Obj} [X/a]b_j \qquad (j = 1 \ldots n)$

$a.m_j := \varsigma(X)b \mapsto_{\varsigma Obj} [m_i = \varsigma(X)b_i^{\,i = 1 \ldots n,\ i \neq j},\ m_j = \varsigma(X)b] \qquad (j = 1 \ldots n)$
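
The two rules can be read operationally as in the following Python sketch. It is an illustration under assumptions, not part of the calculus: an object is modeled as a dictionary from labels to functions of self, and the names invoke and update are hypothetical.

```python
def invoke(obj, name):
    """a.m_j -> [X/a]b_j: run the body of m_j with self bound to the whole object."""
    return obj[name](obj)


def update(obj, name, body):
    """a.m_j := sigma(X)b: build a copy of the object where m_j has the new body."""
    if name not in obj:
        raise KeyError(name)  # only existing methods m_1 ... m_n can be updated
    new_obj = dict(obj)
    new_obj[name] = body
    return new_obj


point = {
    "val": lambda self: (1, 1),
    "get": lambda self: invoke(self, "val"),
}
moved = update(point, "val", lambda self: (2, 2))

print(invoke(point, "get"))  # (1, 1)
print(invoke(moved, "get"))  # (2, 2)
```

Update returns a new object and leaves the original untouched, which matches the functional reading of the second rule.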



The Translation into ρCal. The translation of a ςObj-term into a
corresponding ρCal-term is quite similar to the one of λObj, and can be done in
the theory $T_{\varsigma Obj}$ where the symbol “,” is associative and commutative, and null
is its neutral element. Given the function $kill_m \triangleq (X,\ m \to Y) \to X$, and the
alias $(t_1.m := t_2) \triangleq (kill_m(t_1),\ m \to t_2)$, the translation is defined as follows:

$[\![X]\!] \triangleq X$
$[\![\,[m_i = \varsigma(X)b_i]^{i=1 \ldots n}\,]\!] \triangleq (m_i \to X \to [\![b_i]\!])^{i=1 \ldots n}$
$[\![a.m]\!] \triangleq [\![a]\!].m$
$[\![a.m := \varsigma(X)b]\!] \triangleq [\![a]\!].m := X \to [\![b]\!]$

As a simple example, we present the usual Abadi and Cardelli encoding of
the Point class.

Example 9 (A Point Class). The object PClass is defined in ςObj as follows:
$[new = \varsigma(S)o,\ val = \lambda S'.v(1\ 1),\ get = \lambda S'.(S'.val),\ set = \lambda S'.\lambda N.\,S'.val := N]$

with $o \triangleq [val = \varsigma(S')(S.val)(S'),\ get = \varsigma(S')(S.get)(S'),\ set = \varsigma(S')(S.set)(S')]$, and
it is translated into the ρCal as follows:

$new \to S \to t,\ val \to S \to S' \to v(1\ 1),$
$get \to S \to S' \to S'.val,\ set \to S \to S' \to v(X\ Y) \to (S'.val := S'' \to v(X\ Y))$
with $t \triangleq (val \to S' \to (S.val) \bullet S',\ get \to S' \to (S.get) \bullet S',\ set \to S' \to (S.set) \bullet S')$.

It is not hard to verify that $[\![PClass]\!].new \mapsto^{*} Point_{imp}$.

As another example, we present Abadi and Cardelli’s fixed-point object
operator. To do this we recall the usual encoding of lambda calculus in ςObj:

$[\![S]\!] \triangleq S$
$[\![M N]\!] \triangleq [\![M]\!] \circ [\![N]\!]$
$[\![\lambda S.M]\!] \triangleq [arg = \varsigma(S)S.arg,\ val = \varsigma(S)[S.arg/S][\![M]\!]]$

and the alias $p \circ q \triangleq (p.arg := \varsigma(S)q).val$, which represents the encoding of
function application.

Example 10 (Another Fixed-Point Object). In ςObj, the generic fixed-point
object $Fix \triangleq [arg = \varsigma(S)S.arg,\ val = \varsigma(S)((S.arg).arg := \varsigma(S')S.val).val]$ can
be translated into ρCal as:

$Fix \triangleq arg \to S \to S.arg,\ val \to S \to (kill_{arg}(S.arg),\ arg \to S' \to S.val).val$

Using the aliases $t_1 \circ t_2 \triangleq (t_1.arg := S \to t_2).val$ and $Fix_f \triangleq Fix.arg := S \to f$,
we can prove that $Fix \circ f \triangleq Fix_f.val \mapsto^{*} ((Fix_f.arg).arg := S' \to Fix_f.val).val \mapsto^{*} (f.arg := S' \to Fix \circ f).val \triangleq f \circ (Fix \circ f)$.

The translation into the ρCal can be proved correct when the theory $T_{\varsigma Obj}$ is
considered:

Theorem 3 (Translation of ςObj into ρCal). If $M \mapsto_{\varsigma Obj} N$, then
$[\![M]\!] \mapsto^{*} [\![N]\!]$.

The following example shows that the expressivity of ρCal is strictly stronger
than that of the two previous calculi of objects, since the objects it presents can be
translated neither into λObj nor into ςObj. In fact, we can easily consider “labels” and
“bodies” as first-class entities that can be passed as function arguments.

Example 11 (The Daemon and the Para object).

1. Assume Para is $a \to S \to b,\ par(X) \to S \to S.X$. This object has
a method par(X) which seeks a method name that is assigned to
the variable X and then sends this method to the object itself. Then:
$Para.(par(a)) \triangleq Para \bullet (par(a)) \bullet Para \mapsto^{*} (S \to S.a) \bullet Para \mapsto Para.a \mapsto^{*} b$.

2. Assume Daemon is $set \to S \to X \to (X,\ set \to S' \to Y \to (Y, S'))$.
The set method of Daemon is used to create an object completely from
scratch by receiving from outside all the components of a method, namely,
the labels and the bodies. Once the object is installed, it has the capability
to extend itself upon the reception of the same message set. In some sense
the “power” of Daemon has been inherited by the created object. Then:
$Daemon.set(x \to S \to 3) \triangleq Daemon \bullet set \bullet Daemon \bullet (x \to S \to 3) \mapsto^{*} (X \to (X,\ set \to S' \to Y \to (Y, S'))) \bullet (x \to S \to 3) \mapsto x \to S \to 3,\ set \to S' \to Y \to (Y, S') \triangleq t$, and $t.set(y \to S \to 4) \mapsto^{*} y \to S \to 4,\ t$.
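
Both objects of Example 11 can be imitated in Python once labels are matched rather than compared for equality. The sketch below is only an illustration under assumptions: terms are strings or tuples, strings starting with an uppercase letter play the role of variables, method bodies are Python functions receiving the bindings and the receiver, and the names match, send and grow are hypothetical.

```python
def match(pattern, term):
    """Syntactic matching; variables are strings starting with an uppercase letter."""
    if isinstance(pattern, str):
        if pattern[:1].isupper():
            return {pattern: term}
        return {} if pattern == term else None
    if not isinstance(term, tuple) or len(term) != len(pattern):
        return None
    bindings = {}
    for p, t in zip(pattern, term):
        sub = match(p, t)
        if sub is None:
            return None
        for var, val in sub.items():
            if bindings.setdefault(var, val) != val:
                return None
    return bindings


def send(obj, label, *args):
    """Apply every method whose label pattern matches `label` to the receiver."""
    results = []
    for pattern, body in obj:
        bindings = match(pattern, label)
        if bindings is not None:
            results.extend(body(bindings, obj, *args))
    return results


# Para: par(X) looks up the method whose name is bound to X.
para = (
    ("a", lambda b, self: ["b"]),
    (("par", "X"), lambda b, self: send(self, b["X"])),
)
print(send(para, ("par", "a")))  # ['b']


def grow(bindings, self, methods):
    """set -> S' -> Y -> (Y, S'): put the received methods in front of the receiver."""
    return [methods + self]


# Daemon: set installs the received methods together with a set method of its own.
daemon = (("set", lambda b, self, methods: [methods + (("set", grow),)]),)

t = send(daemon, "set", (("x", lambda b, self: [3]),))[0]
t2 = send(t, "set", (("y", lambda b, self: [4]),))[0]
print(send(t, "x"), send(t2, "y"), send(t2, "x"))  # [3] [4] [3]
```

The object t2 lists the new method y first and then all of t, mirroring the result y → S → 4, t of the example.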

One may wonder if some reductions in the ρCal can be translated into a
suitable computation either in the λObj calculus or in the ςObj calculus. We can
distinguish the following cases:

— reductions à la lambda calculus, like $(X \to t_1) \bullet t_2 \mapsto [X/t_2]t_1$, i.e. with trivial
matching, are directly translated into either λObj or ςObj;

— reductions which use non-trivial matching, like $(a \to b) \bullet c \mapsto null$, can
be translated into either λObj or ςObj, modulo an encoding of the
underlying matching theory (taking into account also possible failures), as
in $Sol(a \ll_{T} c) = null$;

— reductions which use structures, like $(X \to a,\ X \to b) \bullet c \mapsto^{*} (a, b)$, can be
translated into either λObj or ςObj, modulo a non-trivial encoding of the
nondeterministic features intrinsic to the ρCal.

Therefore, when using more elaborate theories and structures, the encoding
becomes at least as difficult as the matching problem underlying the theory
itself.
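
The three kinds of reduction listed above can be told apart with a few lines of Python. This is a rough sketch under assumptions, using plain syntactic matching only (no equational theory): terms are strings or tuples, uppercase strings are variables, a structure is a list of (left-hand side, right-hand side) rules, and the empty result list stands in for null. The function names are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def match(pattern, term, env=None):
    """Return a substitution making pattern equal to term, or None on failure."""
    env = dict(env or {})
    if is_var(pattern):
        if pattern in env and env[pattern] != term:
            return None
        env[pattern] = term
        return env
    if isinstance(pattern, str) or isinstance(term, str):
        return env if pattern == term else None
    if len(pattern) != len(term):
        return None
    for p, t in zip(pattern, term):
        env = match(p, t, env)
        if env is None:
            return None
    return env


def substitute(term, env):
    if is_var(term):
        return env.get(term, term)
    if isinstance(term, str):
        return term
    return tuple(substitute(t, env) for t in term)


def apply(structure, term):
    """Apply every rule of the structure to term and collect the results."""
    results = []
    for lhs, rhs in structure:
        env = match(lhs, term)
        if env is not None:
            results.append(substitute(rhs, env))
    return results


print(apply([("X", ("f", "X"))], "c"))       # trivial matching: [('f', 'c')]
print(apply([("a", "b")], "c"))              # matching failure: []
print(apply([("X", "a"), ("X", "b")], "c"))  # structure: ['a', 'b']
```

Only the first case is plain substitution; the other two need an explicit account of failure and of multiple results, which is what an encoding into λObj or ςObj would have to supply.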

5 Conclusions and Further Work 

We have presented a new version of the Rho Calculus and shown that its
embedded matching power permits us to uniformly and naturally encode various
calculi including the Lambda Calculus of Objects and the Object Calculus.

This presentation of the Rho Calculus inherits the ideas and concepts
of the first proposed one; it simplifies the rules of the calculus and improves
the way the results are handled. This allows us first to encode object-oriented
calculi in a very natural and simple way, and further to design new powerful
object-oriented features, like parameterized methods or self-creating objects.
Based on this new generic approach, an implementation of objects is under way
in the Rho-based language ELAN. More generally, rewrite-based languages
like ASF+SDF, CafeOBJ, Maude, Stratego, or ELAN could benefit from a
Rho-based semantics that gives a first-class status to rewrite rules and to their
application.

We are now planning to work in several directions. First, on giving a big-step
semantics in order to define a deterministic evaluation strategy, when needed.
Then, the calculus could be further generalized by the explicit use of constraints.
For the moment, the (ρ) rule calls for the solution set of the relevant matching
constraint; this could be replaced by an appropriate constrained term, in the
spirit of constraint programming. We are also exploring an elaborate type
system allowing in particular to type self-applications. As we have seen, there
are many possible applications of the framework; a track that we have not yet
mentioned in this paper concerns encoding concurrency in the spirit of the early
work of Viry.

Independently of these ongoing works, we believe that the matching power 
of the Rho Calculus could be widely used, thanks to its expressiveness and 
simplicity, as a new model of computation. 

Acknowledgement. We thank the referees for their constructive remarks,
Hubert Dubois and all the members of the ELAN group for their comments
and interactions on the topics of the Rho Calculus.

Abstract. The dependency pair technique of Arts and Giesl [1,2,3] for
termination proofs of term rewrite systems (TRSs) is extended to rewriting
modulo equations. Up to now, such an extension was only known in
the special case of AC-rewriting [15,17]. In contrast to that, the proposed
technique works for arbitrary non-collapsing equations (satisfying a cer- 
tain linearity condition). With the proposed approach, it is now possible 
to perform automated termination proofs for many systems where this 
was not possible before. In other words, the power of dependency pairs 
can now also be used for rewriting modulo equations. 



1 Introduction 

Termination of term rewriting (e.g., [1,2,3,9,22]) and termination of rewriting 
modulo associativity and commutativity equations (e.g., [8,13,14,20,21]) have 
been extensively studied. For equations other than AC-axioms, however, there 
are only a few techniques available to prove termination (e.g., [6,10,16,18]). 

This paper presents an extension of the dependency pair approach [1,2,3] to 
rewriting modulo equations. In the special case of AC-axioms, our technique 
corresponds to the methods of [15,17], but in contrast to these methods, our 
technique can also be used if the equations are not AC-axioms. This allows much 
more automated termination proofs for equational rewrite systems than those 
possible with directly applying simplification orderings for equational rewriting 
(like equational polynomial orderings or AC-versions of path orderings). 

We first review dependency pairs for ordinary term rewriting in Sect. 2. 
In Sect. 3, we show why a straightforward extension of dependency pairs to 
rewriting modulo equations is not possible. Therefore, we follow an idea similar 
to the one of [17] for AC-axioms: We consider a restricted form of equational 
rewriting, which is more suitable for termination proofs with dependency pairs. 

In Sect. 4, we show how to ensure that termination of this restricted equational
rewrite relation is equivalent to termination of full rewriting modulo equations.
Under certain conditions on the equations $\mathcal{E}$, we show how to compute an
extended rewrite system $Ext_{\mathcal{E}}(\mathcal{R})$ from the given TRS $\mathcal{R}$ such that the restricted
rewrite relation of $Ext_{\mathcal{E}}(\mathcal{R})$ modulo $\mathcal{E}$ is terminating iff $\mathcal{R}$ is terminating modulo
$\mathcal{E}$. This is proved for (almost) arbitrary $\mathcal{E}$-rewriting, thus generalizing a related
result for AC-rewriting. This general result may be of independent interest, and
may also be useful in investigating other properties of $\mathcal{E}$-rewriting. Finally, in
Sect. 5, we extend the dependency pair approach to rewriting modulo equations.



2 Dependency Pairs for Ordinary Rewriting 

The dependency pair approach allows the use of standard methods like simpli- 
fication orderings [9,22] for automated termination proofs where they were not 
applicable before. In this section we briefly summarize the basic concepts of this 
approach. All results in this section are due to Arts and Giesl and we refer to 
[1,2,3] for further details, refinements, and explanations. 

In contrast to the standard techniques for termination proofs, which com- 
pare left and right-hand sides of rules, in this approach one concentrates on the 
subterms in the right-hand sides that have a defined^1 root symbol, because these
are the only terms responsible for starting new reductions. 

More precisely, for every rule $f(s_1, \ldots, s_n) \to C[g(t_1, \ldots, t_m)]$ (where f and g
are defined symbols), we compare the argument tuples $s_1, \ldots, s_n$ and $t_1, \ldots, t_m$.
To avoid the handling of tuples, for every defined symbol f, we introduce a
fresh tuple symbol F. To ease readability, we assume that the original signature
consists of lower case function symbols only, whereas the tuple symbols are
denoted by the corresponding upper case symbols. Now instead of the tuples
$s_1, \ldots, s_n$ and $t_1, \ldots, t_m$ we compare the terms $F(s_1, \ldots, s_n)$ and $G(t_1, \ldots, t_m)$.

Definition 1 (Dependency Pair [1,2,3]). If $f(s_1, \ldots, s_n) \to C[g(t_1, \ldots, t_m)]$ is
a rule of a TRS $\mathcal{R}$ and g is a defined symbol, then $\langle F(s_1, \ldots, s_n), G(t_1, \ldots, t_m) \rangle$
is a dependency pair of $\mathcal{R}$.

Example 2. As an example, consider the TRS $\{a + b \to a + (b + c)\}$, cf. [17].
Termination of this system cannot be shown by simplification orderings, since the
left-hand side of the rule is embedded in the right-hand side. In this system, the
defined symbol is + and thus, we obtain the dependency pairs $\langle P(a, b), P(a, b + c) \rangle$
and $\langle P(a, b), P(b, c) \rangle$ (where P is the tuple symbol for the plus-function “+”).
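
Computing dependency pairs as in Definition 1 is mechanical, as the following Python sketch suggests. The term representation is an assumption made for this illustration: a function application is a tuple whose first component is the symbol (constants are one-element tuples), variables are plain strings, and the tuple symbol of f is written by appending "#" to its name instead of using upper case. The function names are hypothetical.

```python
def subterms(term):
    """Yield the term and all of its subterms."""
    yield term
    if isinstance(term, tuple):
        for arg in term[1:]:
            yield from subterms(arg)


def mark(term):
    """Replace the root symbol f by its tuple symbol, written f# here."""
    return (term[0] + "#",) + term[1:]


def dependency_pairs(rules):
    """All pairs <F(s1..sn), G(t1..tm)> for rules f(s1..sn) -> C[g(t1..tm)], g defined."""
    defined = {lhs[0] for lhs, _ in rules}
    pairs = []
    for lhs, rhs in rules:
        for sub in subterms(rhs):
            if isinstance(sub, tuple) and sub[0] in defined:
                pairs.append((mark(lhs), mark(sub)))
    return pairs


a, b, c = ("a",), ("b",), ("c",)
rules = [(("+", a, b), ("+", a, ("+", b, c)))]  # a + b -> a + (b + c)

for left, right in dependency_pairs(rules):
    print(left, right)
```

For the TRS of Example 2 this yields exactly two pairs, corresponding to $\langle P(a, b), P(a, b + c) \rangle$ and $\langle P(a, b), P(b, c) \rangle$ with "+#" in place of P.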

Arts and Giesl developed the following new termination criterion. As usual,
a quasi-ordering $\succsim$ is a reflexive and transitive relation, and we say that an
ordering $>$ is compatible with $\succsim$ if we have $> \circ \succsim\ \subseteq\ >$ or $\succsim \circ >\ \subseteq\ >$.

Theorem 3 (Termination with Dependency Pairs [1,2,3]). A TRS $\mathcal{R}$ is
terminating iff there exists a weakly monotonic quasi-ordering $\succsim$ and a
well-founded ordering $>$ compatible with $\succsim$, where both $\succsim$ and $>$ are closed under
substitution, such that

(1) $s > t$ for all dependency pairs $\langle s, t \rangle$ of $\mathcal{R}$ and

(2) $l \succsim r$ for all rules $l \to r$ of $\mathcal{R}$.

^1 Root symbols of left-hand sides are defined and all other functions are constructors.

Consider the TRS from Ex. 2 again. In order to prove its termination according
to Thm. 3, we have to find a suitable quasi-ordering $\succsim$ and ordering $>$
such that $P(a, b) > P(a, b + c)$, $P(a, b) > P(b, c)$, and $a + b \succsim a + (b + c)$.

Most standard orderings amenable to automation are strongly monotonic 
(cf. e.g. [9,22]), whereas here we only need weak monotonicity. Hence, before 
synthesizing a suitable ordering, some of the arguments of function symbols may 
be eliminated, cf. [3]. For example, in our inequalities, one may eliminate the
first argument of +. Then every term $s + t$ in the inequalities is replaced by $+'(t)$
(where $+'$ is a new unary function symbol). By comparing the terms resulting
from this replacement instead of the original terms, we can take advantage of
the fact that + does not have to be strongly monotonic in its first argument.
Note that there are only finitely many possibilities to eliminate arguments of 
function symbols. Therefore all these possibilities can be checked automatically. 
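
The replacement of $s + t$ by $+'(t)$ can be written as a short recursive function. The Python sketch below is an illustration under assumptions: terms are tuples headed by their symbol, constants are one-element tuples, and the function name and its default arguments are hypothetical.

```python
def eliminate_first_argument(term, symbol="+", renamed="+'"):
    """Replace every subterm symbol(s, t) by renamed(t), working bottom-up."""
    if not isinstance(term, tuple):
        return term  # variables are left unchanged
    args = [eliminate_first_argument(arg, symbol, renamed) for arg in term[1:]]
    if term[0] == symbol and len(args) == 2:
        return (renamed, args[1])
    return (term[0], *args)


a, b, c = ("a",), ("b",), ("c",)

print(eliminate_first_argument(("P", a, ("+", b, c))))  # P(a, +'(c))
print(eliminate_first_argument(("+", a, ("+", b, c))))  # +'(+'(c))
print(eliminate_first_argument(("+", a, b)))            # +'(b)
```

Applied to the three inequalities of the running example, this produces the terms that are then compared by the ordering.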

In this way, we obtain the inequalities $P(a, b) > P(a, +'(c))$, $P(a, b) > P(b, c)$,
and $+'(b) \succsim +'(+'(c))$. These inequalities are satisfied by the recursive path
ordering (rpo) [9] with the precedence $a \sqsupset b \sqsupset c \sqsupset +'$ (i.e., we choose $\succsim$ to
be $\succsim_{rpo}$ and $>$ to be $\succ_{rpo}$). So termination of this TRS can now be proved
automatically. For implementations of the dependency pair approach see [4,7].

3 Rewriting Modulo Equations 

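For a set $\mathcal{E}$ of equations between terms, we write $s \to_{\mathcal{E}} t$ if there exist an
equation $l \approx r$ in $\mathcal{E}$, a substitution $\sigma$, and a context C such that $s = C[l\sigma]$ and
$t = C[r\sigma]$. The symmetric closure of $\to_{\mathcal{E}}$ is denoted by $\vdash\!\dashv_{\mathcal{E}}$ and the transitive
reflexive closure of $\vdash\!\dashv_{\mathcal{E}}$ is denoted by $\sim_{\mathcal{E}}$. In the following, we restrict ourselves
to equations $\mathcal{E}$ where $\sim_{\mathcal{E}}$ is decidable.

Definition 4 (Rewriting Modulo Equations). Let $\mathcal{R}$ be a TRS and let $\mathcal{E}$ be
a set of equations. A term s rewrites to a term t modulo $\mathcal{E}$, denoted $s \to_{\mathcal{R}/\mathcal{E}} t$,
iff there exist terms s' and t' such that $s \sim_{\mathcal{E}} s' \to_{\mathcal{R}} t' \sim_{\mathcal{E}} t$. The TRS $\mathcal{R}$ is called
terminating modulo $\mathcal{E}$ iff there does not exist an infinite $\to_{\mathcal{R}/\mathcal{E}}$ reduction.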

Example 5. An interesting special case are equations $\mathcal{E}$ which state that certain
function symbols are associative and commutative (AC). As an example, consider
the TRS $\mathcal{R} = \{a + b \to a + (b + c)\}$ again and let $\mathcal{E}$ consist of the associativity
and commutativity axioms for +, i.e., $\mathcal{E} = \{x_1 + x_2 \approx x_2 + x_1,\ x_1 + (x_2 + x_3) \approx (x_1 + x_2) + x_3\}$,
cf. [17]. $\mathcal{R}$ is not terminating modulo $\mathcal{E}$, since we have

$a + b \to_{\mathcal{R}} a + (b + c) \sim_{\mathcal{E}} (a + b) + c \to_{\mathcal{R}} (a + (b + c)) + c \sim_{\mathcal{E}} ((a + b) + c) + c \to_{\mathcal{R}} \ldots$

There are, however, many other sets of equations $\mathcal{E}$ apart from associativity
and commutativity, which are also important in practice, cf. [11]. Hence, our aim 
is to extend dependency pairs to rewriting modulo (almost) arbitrary equations. 

The soundness of dependency pairs for ordinary rewriting relies on the fact
that whenever a term starts an infinite reduction, then one can also construct
an infinite reduction where only terminating or minimal non-terminating subterms
are reduced (i.e., one only applies rules to redexes without proper
non-terminating subterms). The contexts of minimal non-terminating redexes can
be completely disregarded. If a rule is applied at the root position of a minimal
non-terminating subterm s (i.e., $s \to^{\varepsilon}_{\mathcal{R}} t$ where $\varepsilon$ denotes the root position),
then s and each minimal non-terminating subterm t' of t correspond to a dependency
pair. Hence, Thm. 3 (1) implies $s > t'$. If a rule is applied at a non-root
position of a minimal non-terminating subterm s (i.e., $s \to^{>\varepsilon}_{\mathcal{R}} t$), then we have
$s \succsim t$ by Thm. 3 (2). However, due to the minimality of s, after finitely many
such non-root rewrite steps, a rule must be applied at the root position of the
minimal non-terminating term. Thus, every infinite reduction of minimal
non-terminating subterms corresponds to an infinite $>$-sequence. This contradicts
the well-foundedness of $>$.

So for ordinary rewriting, any infinite reduction from a minimal non-terminating
subterm involves an $\mathcal{R}$-reduction at the root position. But as observed in
[15], when extending the dependency pair approach to rewriting modulo equations,
this is no longer true. For an illustration, consider Ex. 5 again, where
$a + (b + c)$ is a minimal non-terminating term. However, in its infinite
$\mathcal{R}/\mathcal{E}$-reduction no $\mathcal{R}$-step is ever applicable at the root position. (Instead one applies
an $\mathcal{E}$-step at the root position and further $\mathcal{R}$- and $\mathcal{E}$-steps below the root.)

In the rest of the paper, from a rewrite system $\mathcal{R}$, we generate a new rewrite
system $\mathcal{R}'$ with the following three properties: (i) the termination of a weaker
form of rewriting by $\mathcal{R}'$ modulo $\mathcal{E}$ is equivalent to the termination of $\mathcal{R}$ modulo
$\mathcal{E}$, (ii) every infinite reduction of a minimal non-terminating term in this weaker
form of rewriting by $\mathcal{R}'$ modulo $\mathcal{E}$ involves a reduction step at the root level, and
(iii) every such minimal non-terminating term has an infinite reduction where
the variables of the $\mathcal{R}'$-rules are instantiated with terminating terms only.



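4 $\mathcal{E}$-Extended Rewriting

We showed why the dependency pair approach cannot be extended to rewriting
modulo equations directly. As a solution for this problem, we propose to consider
a restricted form of rewriting modulo equations, i.e., the so-called $\mathcal{E}$-extended
$\mathcal{R}$-rewrite relation $\to_{\mathcal{E}\backslash\mathcal{R}}$. (This approach was already taken in [17] for rewriting
modulo AC.) The relation $\to_{\mathcal{E}\backslash\mathcal{R}}$ was originally introduced in [19] in order to
circumvent the problems with infinite or impractically large $\mathcal{E}$-equivalence classes.^2

Definition 6 ($\mathcal{E}$-extended $\mathcal{R}$-rewriting [19]). Let $\mathcal{R}$ be a TRS and let $\mathcal{E}$ be
a set of equations. The $\mathcal{E}$-extended $\mathcal{R}$-rewrite relation is defined as $s \to^{\pi}_{\mathcal{E}\backslash\mathcal{R}} t$ iff
$s|_{\pi} \sim_{\mathcal{E}} l\sigma$ and $t = s[r\sigma]_{\pi}$ for some rule $l \to r$ in $\mathcal{R}$, some position $\pi$ of s, and
some substitution $\sigma$. We also write $\to_{\mathcal{E}\backslash\mathcal{R}}$ instead of $\to^{\pi}_{\mathcal{E}\backslash\mathcal{R}}$.

To demonstrate the difference between $\to_{\mathcal{R}/\mathcal{E}}$ and $\to_{\mathcal{E}\backslash\mathcal{R}}$, consider Ex. 5
again. We have already seen that $\to_{\mathcal{R}/\mathcal{E}}$ is not terminating, since
$a + b \to_{\mathcal{R}/\mathcal{E}} (a + b) + c \to_{\mathcal{R}/\mathcal{E}} ((a + b) + c) + c \to_{\mathcal{R}/\mathcal{E}} \ldots$ But $\to_{\mathcal{E}\backslash\mathcal{R}}$ is terminating, because
$a + b \to_{\mathcal{E}\backslash\mathcal{R}} a + (b + c)$, which is a normal form w.r.t. $\to_{\mathcal{E}\backslash\mathcal{R}}$.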

^2 In [12], the relation $\to_{\mathcal{E}\backslash\mathcal{R}}$ is denoted $\to_{\mathcal{R},\mathcal{E}}$.

The above example also demonstrates that in general, termination of $\to_{\mathcal{E}\backslash\mathcal{R}}$
is not sufficient for termination of $\to_{\mathcal{R}/\mathcal{E}}$. In this section we will show how
termination of $\to_{\mathcal{R}/\mathcal{E}}$ can nevertheless be ensured by only regarding an $\mathcal{E}$-extended
rewrite relation induced by a larger $\mathcal{R}' \supseteq \mathcal{R}$.

For the special case of AC-rewriting, this problem can be solved by extending
$\mathcal{R}$ as follows: Let $\mathcal{G}$ be the set of all AC-symbols and

$Ext_{AC(\mathcal{G})}(\mathcal{R}) = \mathcal{R} \cup \{ f(l, y) \to f(r, y) \mid l \to r \in \mathcal{R},\ root(l) = f \in \mathcal{G} \}$,

where y is a new variable not occurring in the respective rule $l \to r$. A similar
extension has also been used in previous work on extending dependency pairs
to AC-rewriting [17]. The reason is that for AC-equations $\mathcal{E}$, the termination of
$\to_{\mathcal{R}/\mathcal{E}}$ is in fact equivalent to the termination of $\to_{\mathcal{E}\backslash Ext_{AC(\mathcal{G})}(\mathcal{R})}$.

For Ex. 5, we obtain $Ext_{AC(\mathcal{G})}(\mathcal{R}) = \{a + b \to a + (b + c),\ (a + b) + y \to (a + (b + c)) + y\}$.
Thus, in order to prove termination of $\to_{\mathcal{R}/\mathcal{E}}$, it is now sufficient
to verify termination of $\to_{\mathcal{E}\backslash Ext_{AC(\mathcal{G})}(\mathcal{R})}$.
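
The AC-extension is easy to compute. The following Python sketch illustrates it under assumptions: function applications are tuples headed by their symbol, constants are one-element tuples, variables are plain strings, and the function names as well as the default name of the fresh variable are hypothetical.

```python
def variables(term):
    """Set of variables (plain strings) occurring in a term."""
    if isinstance(term, str):
        return {term}
    result = set()
    for arg in term[1:]:
        result |= variables(arg)
    return result


def ext_ac(rules, ac_symbols, fresh="y"):
    """Add f(l, y) -> f(r, y) for every rule l -> r whose root f is an AC-symbol."""
    extended = list(rules)
    for lhs, rhs in rules:
        f = lhs[0]
        if f in ac_symbols:
            if fresh in variables(lhs) | variables(rhs):
                raise ValueError(f"variable {fresh!r} already occurs in the rule")
            extended.append(((f, lhs, fresh), (f, rhs, fresh)))
    return extended


a, b, c = ("a",), ("b",), ("c",)
rules = [(("+", a, b), ("+", a, ("+", b, c)))]  # a + b -> a + (b + c)

for lhs, rhs in ext_ac(rules, {"+"}):
    print(lhs, "->", rhs)
```

For the rule of Ex. 5 the result contains the original rule and the extended rule $(a + b) + y \to (a + (b + c)) + y$.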

The above extension of [19] only works for AC-axioms $\mathcal{E}$. A later paper [12]
treats arbitrary equations, but it does not contain any definition for extensions
$Ext_{\mathcal{E}}(\mathcal{R})$, and termination of $\to_{\mathcal{R}/\mathcal{E}}$ is always a prerequisite in [12]. The reason
is that [12] and also subsequent work on symmetrization and coherence were
devoted to the development of completion algorithms (i.e., here the goal was
to generate a convergent rewrite system and not to investigate the termination
behavior of possibly non-terminating TRSs). Thus, these papers did not compare
the termination behavior of full rewriting modulo equations with the termination
of restricted versions of rewriting modulo equations. In fact, [12] focuses on the
notion of coherence, which is not suitable for our purpose since coherence of $\mathcal{E}\backslash\mathcal{R}$
modulo $\mathcal{E}$ does not imply that termination of $\to_{\mathcal{R}/\mathcal{E}}$ is equivalent to termination
of $\to_{\mathcal{E}\backslash\mathcal{R}}$.^3

To extend dependency pairs to rewriting modulo non-AC-equations $\mathcal{E}$, we
have to compute extensions $Ext_{\mathcal{E}}(\mathcal{R})$ such that termination of $\to_{\mathcal{R}/\mathcal{E}}$ is equivalent
to termination of $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$. The only restriction we will impose on the
equations in $\mathcal{E}$ is that they must have identical unique variables. This requirement
is satisfied by most practical examples where $\mathcal{R}/\mathcal{E}$ is terminating. As usual,
a term t is called linear if no variable occurs more than once in t.

Definition 7 (Equations with Identical Unique Variables [19]). An equation
$u \approx v$ is said to have identical unique variables if u and v are both linear
and the variables in u are the same as the variables in v.

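Let $uni_{\mathcal{E}}(s, t)$ denote a complete set of $\mathcal{E}$-unifiers of two terms s and t. As
usual, $\delta$ is an $\mathcal{E}$-unifier of s and t iff $s\delta \sim_{\mathcal{E}} t\delta$ and a set $uni_{\mathcal{E}}(s, t)$ of $\mathcal{E}$-unifiers is
complete iff for every $\mathcal{E}$-unifier $\delta$ there exists a $\sigma \in uni_{\mathcal{E}}(s, t)$ and a substitution
$\rho$ such that $\delta \sim_{\mathcal{E}} \sigma\rho$, cf. [5]. (“$\sigma\rho$” is the composition of $\sigma$ and $\rho$ where $\sigma$ is
applied first and “$\delta \sim_{\mathcal{E}} \sigma\rho$” means that for all variables x we have $x\delta \sim_{\mathcal{E}} x\sigma\rho$.)

^3 In [12], $\mathcal{E}\backslash\mathcal{R}$ is coherent modulo $\mathcal{E}$ iff for all terms s, t, u, we have that $s \sim_{\mathcal{E}} t \to^{+}_{\mathcal{E}\backslash\mathcal{R}} u$
implies $s \to^{+}_{\mathcal{E}\backslash\mathcal{R}} v \sim_{\mathcal{E}} w \leftarrow^{*}_{\mathcal{E}\backslash\mathcal{R}} u$ for some v, w. Consider $\mathcal{R} = \{a + b \to a + (b + c),\ x + y \to d\}$
with $\mathcal{E}$ being the AC-axioms for +. The above system is coherent,
since $s \sim_{\mathcal{E}} t \to^{+}_{\mathcal{E}\backslash\mathcal{R}} u$ implies $s \to^{+}_{\mathcal{E}\backslash\mathcal{R}} d \leftarrow^{*}_{\mathcal{E}\backslash\mathcal{R}} u$. However, $\to_{\mathcal{E}\backslash\mathcal{R}}$ is terminating but
$\to_{\mathcal{R}/\mathcal{E}}$ is not terminating.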

To construct $Ext_{\mathcal{E}}(\mathcal{R})$, we consider all overlaps between equations $u \approx v$ or
$v \approx u$ from $\mathcal{E}$ and rules $l \to r$ from $\mathcal{R}$. More precisely, we check whether a
non-variable subterm $v|_{\pi}$ of v $\mathcal{E}$-unifies with l (where we always assume that rules
in $\mathcal{R}$ are variable disjoint from equations in $\mathcal{E}$). In this case one adds the rules
$(v[l]_{\pi})\sigma \to (v[r]_{\pi})\sigma$ for all $\sigma \in uni_{\mathcal{E}}(v|_{\pi}, l)$.^4 In Ex. 5, the subterm $x_1 + x_2$ of
the right-hand side of $x_1 + (x_2 + x_3) \approx (x_1 + x_2) + x_3$ unifies with the left-hand
side of the only rule $a + b \to a + (b + c)$. Thus, in the extension of $\mathcal{R}$, we obtain
the rule $(a + b) + y \to (a + (b + c)) + y$.

$Ext_{\mathcal{E}}(\mathcal{R})$ is built via a kind of fixpoint construction, i.e., we also have to
consider overlaps between equations of $\mathcal{E}$ and the newly constructed rules of
$Ext_{\mathcal{E}}(\mathcal{R})$. For example, the subterm $x_1 + x_2$ also unifies with the left-hand side
of the new rule $(a + b) + y \to (a + (b + c)) + y$. Thus, one would now construct
a new rule $((a + b) + y) + z \to ((a + (b + c)) + y) + z$.

Obviously, in this way one obtains an infinite number of rules by subsequently
overlapping equations with the newly constructed rules. However, in order to
use $Ext_{\mathcal{E}}(\mathcal{R})$ for automated termination proofs, our aim is to restrict ourselves
to finitely many rules. It turns out that we do not have to include new rules
$(v[l]_{\pi})\sigma \to (v[r]_{\pi})\sigma$ in $Ext_{\mathcal{E}}(\mathcal{R})$ if $u\sigma \to^{\pi'}_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} q \sim_{\mathcal{E}} (v[r]_{\pi})\sigma$ already holds
for some position $\pi'$ of u and some term q (using just the old rules of $Ext_{\mathcal{E}}(\mathcal{R})$).

When constructing the rule $((a + b) + y) + z \to ((a + (b + c)) + y) + z$ above,
the equation $u \approx v$ used was $x_1 + (x_2 + x_3) \approx (x_1 + x_2) + x_3$ and the unifier $\sigma$
replaced $x_1$ by $(a + b)$ and $x_2$ by y. Hence, here $u\sigma$ is the term $(a + b) + (y + x_3)$.
But this term reduces with $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$ to $(a + (b + c)) + (y + x_3)$ which is indeed
$\sim_{\mathcal{E}}$-equivalent to $(v[r]_{\pi})\sigma$, i.e., to $((a + (b + c)) + y) + x_3$. Thus, we do not have
to include the rule $((a + b) + y) + z \to ((a + (b + c)) + y) + z$ in $Ext_{\mathcal{E}}(\mathcal{R})$.

The following definition shows how suitable extensions can be computed for
arbitrary equations with identical unique variables. It will turn out that with
these extensions one can indeed simulate $\to_{\mathcal{R}/\mathcal{E}}$ by $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$, i.e., $s \to_{\mathcal{R}/\mathcal{E}} t$
implies $s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t'$ for some $t' \sim_{\mathcal{E}} t$. This constitutes a crucial contribution
of the paper, since it is the main requirement needed in order to extend
dependency pairs to rewriting modulo equations.

Definition 8 (Extending $\mathcal{R}$ for Arbitrary Equations). Let $\mathcal{R}$ be a TRS
and let $\mathcal{E}$ be a set of equations. Let $\mathcal{R}'$ be a set containing only rules of the form
$C[l\sigma] \to C[r\sigma]$ (where C is a context, $\sigma$ is a substitution, and $l \to r \in \mathcal{R}$). $\mathcal{R}'$
is an extension of $\mathcal{R}$ for the equations $\mathcal{E}$ iff

(a) $\mathcal{R} \subseteq \mathcal{R}'$ and

(b) for all $l \to r \in \mathcal{R}'$, $u \approx v \in \mathcal{E}$ and $v \approx u \in \mathcal{E}$, all positions $\pi$ of v
and $\sigma \in uni_{\mathcal{E}}(v|_{\pi}, l)$, there is a position $\pi'$ in u and a $q \sim_{\mathcal{E}} (v[r]_{\pi})\sigma$ with
$u\sigma \to^{\pi'}_{\mathcal{E}\backslash\mathcal{R}'} q$.

^4 Obviously, $uni_{\mathcal{E}}(v|_{\pi}, l)$ always exists, but it can be infinite in general. So when
automating our approach for equational termination proofs, we have to restrict ourselves
to equations $\mathcal{E}$ where $uni_{\mathcal{E}}(v|_{\pi}, l)$ can be chosen to be finite for all subterms
$v|_{\pi}$ of equations and left-hand sides of rules l. This includes all sets $\mathcal{E}$ of finitary
unification type, but our restriction is weaker, since we only need finiteness for certain
terms $v|_{\pi}$ and l.

In the following, let Ext£{TV) always denote an arbitrary extension of TZ for S. 

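In order to satisfy Condition (b) of Def. 8, it is always sufficient to add the rule
$(v[l]_\pi)\sigma \to (v[r]_\pi)\sigma$ to $\mathcal{R}'$. The reason is that then we have $u\sigma \to_{\mathcal{E}\backslash\mathcal{R}'} (v[r]_\pi)\sigma$.
But if $u\sigma \to^{\pi'}_{\mathcal{E}\backslash\mathcal{R}'} q \sim_{\mathcal{E}} (v[r]_\pi)\sigma$ already holds with the other rules of $\mathcal{R}'$, then
the rule $(v[l]_\pi)\sigma \to (v[r]_\pi)\sigma$ does not have to be added to $\mathcal{R}'$.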

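Condition (b) of Def. 8 also makes sure that as long as the equations
have identical unique variables, we do not have to consider overlaps at variable
positions (see the footnote on overlaps at variable positions). The reason is that
if $v|_\pi$ is a variable $x \in \mathcal{V}$, then we have
$u\sigma = u\sigma[x\sigma]_{\pi'} \sim_{\mathcal{E}} u\sigma[l\sigma]_{\pi'} \to_{\mathcal{R}'} u\sigma[r\sigma]_{\pi'} \sim_{\mathcal{E}} v\sigma[r\sigma]_\pi = (v[r]_\pi)\sigma$, where $\pi'$ is the
position of $x$ in $u$. Hence, such rules $(v[l]_\pi)\sigma \to (v[r]_\pi)\sigma$ do not have to be included
in $\mathcal{R}'$.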

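Overlaps at root positions do not have to be considered either. To see this,
assume that $\pi$ is the top position $\epsilon$ of $v$, i.e., that $v\sigma \sim_{\mathcal{E}} l\sigma$. In this case we have
$u\sigma \sim_{\mathcal{E}} v\sigma \sim_{\mathcal{E}} l\sigma \to_{\mathcal{R}'} r\sigma$ and thus, $u\sigma \to_{\mathcal{E}\backslash\mathcal{R}'} r\sigma = (v[r]_\pi)\sigma$. So again, such rules
$(v[l]_\pi)\sigma \to (v[r]_\pi)\sigma$ do not have to be included in $\mathcal{R}'$.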

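The following procedure is used to compute extensions. Here, we assume both
$\mathcal{R}$ and $\mathcal{E}$ to be finite, where the equations $\mathcal{E}$ must have identical unique variables.

1. $\mathcal{R}' := \mathcal{R}$

2. For all $l \to r \in \mathcal{R}'$,
   all $u \approx v$ or $v \approx u$ from $\mathcal{E}$,
   and all positions $\pi$ of $v$ where $\pi \neq \epsilon$ and $v|_\pi \notin \mathcal{V}$ do:

   2.1. Let $\Sigma := uni_{\mathcal{E}}(v|_\pi, l)$.

   2.2. For all $\sigma \in \Sigma$ do:

      2.2.1. Let $T := \{q \mid u\sigma \to^{\pi'}_{\mathcal{E}\backslash\mathcal{R}'} q$ for a position $\pi'$ of $u\}$.

      2.2.2. If there exists a $q \in T$ with $(v[r]_\pi)\sigma \sim_{\mathcal{E}} q$, then $\Sigma := \Sigma \setminus \{\sigma\}$.

   2.3. $\mathcal{R}' := \mathcal{R}' \cup \{(v[l]_\pi)\sigma \to (v[r]_\pi)\sigma \mid \sigma \in \Sigma\}$.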

This algorithm has the following properties: 

(a) If in Step 2.1, $uni_{\mathcal{E}}(v|_\pi, l)$ is finite and computable, then every step in the
algorithm is computable.

(b) If the algorithm terminates, then the final value of $\mathcal{R}'$ is an extension of $\mathcal{R}$
for the equations $\mathcal{E}$.

With the TRS of Ex. 5, $Ext_{\mathcal{E}}(\mathcal{R}) = \{a + b \to a + (b + c),\ (a + b) + y \to
(a + (b + c)) + y\}$. In general, if $\mathcal{E}$ only consists of AC-axioms for some function
symbols $G$, then Def. 8 "coincides" with the well-known extension for AC-axioms,
i.e., $\mathcal{R}' = \mathcal{R} \cup \{f(l, y) \to f(r, y) \mid l \to r \in \mathcal{R},\ root(l) = f \in G\}$ satisfies the
conditions (a) and (b) of Def. 8. So in case of AC-equations, our approach indeed
corresponds to the approaches of [15,17]. However, Def. 8 can also be used for
other forms of equations.

Footnote (on overlaps at variable positions): Note that considering overlaps at variable positions as well would still not allow us
to treat equations with non-linear terms. As an example regard $\mathcal{E} = \{f(x) \approx g(x, x)\}$
and $\mathcal{R} = \{g(a, b) \to f(a),\ a \to b\}$. Here, $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$ is well founded although $\mathcal{R}$ is
not terminating modulo $\mathcal{E}$.
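The AC special case mentioned above is simple enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: terms are encoded as nested tuples `(symbol, args)` with variables as plain strings, and the helper names `app`, `variables`, `fresh_variable`, and `ac_extension` are hypothetical, not taken from the paper. It builds $\mathcal{R} \cup \{f(l, y) \to f(r, y)\}$ for rules whose left-hand side has an AC root symbol; the general procedure of Def. 8 would additionally need $\mathcal{E}$-unification, which is not attempted here.

```python
def app(symbol, *args):
    """Build the term symbol(args...); variables are plain strings."""
    return (symbol, tuple(args))


def variables(term):
    """Collect the variables (strings) occurring in a term."""
    if isinstance(term, str):
        return {term}
    result = set()
    for arg in term[1]:
        result |= variables(arg)
    return result


def fresh_variable(used, base="y"):
    """Return a variable name that does not occur in `used`."""
    name, i = base, 0
    while name in used:
        i += 1
        name = f"{base}{i}"
    return name


def ac_extension(rules, ac_symbols):
    """Return R together with f(l, y) -> f(r, y) for every rule l -> r
    whose root symbol f is in ac_symbols (left-hand sides are assumed
    to be non-variable terms)."""
    extended = list(rules)
    for lhs, rhs in rules:
        f = lhs[0]
        if f in ac_symbols:
            y = fresh_variable(variables(lhs) | variables(rhs))
            extended.append((app(f, lhs, y), app(f, rhs, y)))
    return extended


# The rule a + b -> a + (b + c); "+" is treated as an AC symbol here
# purely for illustration.
a, b, c = app("a"), app("b"), app("c")
ex5_rule = (app("+", a, b), app("+", a, app("+", b, c)))
ext = ac_extension([ex5_rule], {"+"})
```

For this input the second element of `ext` is the rule $(a + b) + y \to (a + (b + c)) + y$, which has the same shape as the extension rule listed for Ex. 5.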

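Example 9. As an example, consider the following system from [18].

$$\mathcal{R} = \{\ x - 0 \to x,\quad s(x) - s(y) \to x - y,\quad 0 \div s(y) \to 0,\quad s(x) \div s(y) \to s((x - y) \div s(y))\ \}$$

$$\mathcal{E} = \{\ (u \div v) \div w \approx (u \div w) \div v\ \}$$

By overlapping the subterm $u \div w$ in the right-hand side of the equation with
the left-hand sides of the last two rules we obtain

$$Ext_{\mathcal{E}}(\mathcal{R}) = \mathcal{R} \cup \{\ (0 \div s(y)) \div z \to 0 \div z,\quad (s(x) \div s(y)) \div z \to s((x - y) \div s(y)) \div z\ \}.$$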

Note that these are indeed all the rules of $Ext_{\mathcal{E}}(\mathcal{R})$. Overlapping the subterm
$u \div v$ of the equation's left-hand side with the third rule would result in
$(0 \div s(y)) \div z' \to 0 \div z'$. But this new rule does not have to be included in
$Ext_{\mathcal{E}}(\mathcal{R})$, since the corresponding other term of the equation, $(0 \div z') \div s(y)$,
would $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduce with the rule $(0 \div s(y)) \div z \to 0 \div z$ to $0 \div z'$.
Overlapping $u \div v$ with the left-hand side of the fourth rule is also superfluous.

Similarly, overlaps with the new rules $(0 \div s(y)) \div z \to 0 \div z$ or
$(s(x) \div s(y)) \div z \to s((x - y) \div s(y)) \div z$ also do not give rise to additional rules in
$Ext_{\mathcal{E}}(\mathcal{R})$. To see this, overlap the subterm $u \div w$ in the right-hand side of the
equation with the left-hand side of $(0 \div s(y)) \div z \to 0 \div z$. This gives the rule
$((0 \div s(y)) \div z) \div z' \to (0 \div z) \div z'$. However, the corresponding other term of
the equation is $((0 \div s(y)) \div z') \div z$. This reduces at position 1 (or position 11)
to $(0 \div z') \div z$, which is $\mathcal{E}$-equivalent to $(0 \div z) \div z'$. Overlaps with the other new
rule $(s(x) \div s(y)) \div z \to s((x - y) \div s(y)) \div z$ are not needed either.
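The last argument of Example 9 can be replayed mechanically. The sketch below is a minimal illustration, assuming a tuple encoding of terms (variables as strings) and purely syntactic matching; the helper names `match`, `substitute`, `subterm`, `replace`, and `rewrite_at` are hypothetical. It rewrites $((0 \div s(y)) \div z') \div z$ at position 1 and at position 11 and then applies the equation once at the root.

```python
def app(symbol, *args):
    """Build the term symbol(args...); variables are plain strings."""
    return (symbol, tuple(args))


def match(pattern, term, subst=None):
    """Syntactic matching: return a substitution s with pattern*s == term, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                 # pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern[1]) != len(term[1]):
        return None
    for p, t in zip(pattern[1], term[1]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Apply a substitution simultaneously to all variables of a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], tuple(substitute(arg, subst) for arg in term[1]))


def subterm(term, pos):
    """Subterm at a position given as a tuple of 1-based argument indices."""
    for i in pos:
        term = term[1][i - 1]
    return term


def replace(term, pos, new):
    """Replace the subterm at position pos by new."""
    if not pos:
        return new
    args = list(term[1])
    args[pos[0] - 1] = replace(args[pos[0] - 1], pos[1:], new)
    return (term[0], tuple(args))


def rewrite_at(term, pos, rule):
    """One rewrite step with rule = (lhs, rhs) at position pos, or None."""
    lhs, rhs = rule
    subst = match(lhs, subterm(term, pos))
    return None if subst is None else replace(term, pos, substitute(rhs, subst))


def q(s, t):
    return app("quot", s, t)


zero = app("0")
rule3 = (q(zero, app("s", "y")), zero)                      # 0 / s(y) -> 0
new_rule = (q(q(zero, app("s", "y")), "z"), q(zero, "z"))   # (0 / s(y)) / z -> 0 / z
equation = (q(q("u", "v"), "w"), q(q("u", "w"), "v"))       # (u / v) / w = (u / w) / v

other_term = q(q(q(zero, app("s", "y")), "z'"), "z")        # ((0 / s(y)) / z') / z
via_pos_1 = rewrite_at(other_term, (1,), new_rule)
via_pos_11 = rewrite_at(other_term, (1, 1), rule3)
e_step = rewrite_at(via_pos_1, (), equation)                # one E-step at the root
```

Both rewrite steps yield $(0 \div z') \div z$, and a single use of the equation at the root turns it into $(0 \div z) \div z'$, which is the right-hand side of the candidate rule.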

Nevertheless, the above algorithm for computing extensions does not always
terminate. For example, for $\mathcal{R} = \{a(x) \to c(x)\}$, $\mathcal{E} = \{a(b(a(x))) \approx b(a(b(x)))\}$,
it can be shown that all extensions $Ext_{\mathcal{E}}(\mathcal{R})$ are infinite.

We prove below that $Ext_{\mathcal{E}}(\mathcal{R})$ (according to Def. 8) has the desired property
needed to reduce rewriting modulo equations to $\mathcal{E}$-extended rewriting. The
following important lemma states that whenever $s$ rewrites to $t$ with $\mathcal{R}$ modulo
$\mathcal{E}$, then $s$ also rewrites with $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$ to a term which is $\mathcal{E}$-equivalent to $t$.

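Lemma 10 (Connection between $\to_{\mathcal{R}/\mathcal{E}}$ and $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$). Let $\mathcal{R}$ be a TRS
and let $\mathcal{E}$ be a set of equations with identical unique variables. If $s \to_{\mathcal{R}/\mathcal{E}} t$, then
there exists a term $t' \sim_{\mathcal{E}} t$ such that $s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t'$.

Footnote (on the construction of $Ext_{\mathcal{E}}$): Our extension $Ext_{\mathcal{E}}$ has some similarities to the construction of contexts in [23].
However, in contrast to [23] we also consider the rules of $\mathcal{R}'$ in Condition (b) of Def.
8 in order to reduce the number of rules in $Ext_{\mathcal{E}}$. Moreover, in [23] equations may
also be non-linear (and thus, Lemma 10 does not hold there).

Proof. Let $s \to_{\mathcal{R}/\mathcal{E}} t$, i.e., there exist terms $s_0, \ldots, s_n, p$ with $n \geq 0$ such that
$s = s_n \vdash\!\dashv_{\mathcal{E}} s_{n-1} \vdash\!\dashv_{\mathcal{E}} \ldots \vdash\!\dashv_{\mathcal{E}} s_0 \to_{\mathcal{R}} p \sim_{\mathcal{E}} t$. For the lemma, it suffices to show
that there is a $t' \sim_{\mathcal{E}} p$ such that $s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t'$, since $t' \sim_{\mathcal{E}} p$ implies $t' \sim_{\mathcal{E}} t$.

We perform induction on $n$. If $n = 0$, we have $s = s_n = s_0 \to_{\mathcal{R}} p$. This
implies $s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} p$ since $\mathcal{R} \subseteq Ext_{\mathcal{E}}(\mathcal{R})$. So with $t' = p$ the claim is proved.

If $n > 0$, the induction hypothesis implies $s = s_n \vdash\!\dashv_{\mathcal{E}} s_{n-1} \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t'$
such that $t' \sim_{\mathcal{E}} p$. So there exists an equation $u \approx v$ or $v \approx u$ from $\mathcal{E}$ and a
rule $l \to r$ from $Ext_{\mathcal{E}}(\mathcal{R})$ such that $s|_\tau = u\delta$, $s_{n-1} = s[v\delta]_\tau$, $s_{n-1}|_\xi \sim_{\mathcal{E}} l\delta$, and
$t' = s_{n-1}[r\delta]_\xi$ for positions $\tau$ and $\xi$ and a substitution $\delta$. We can use the same
substitution $\delta$ for instantiating the equation $u \approx v$ (or $v \approx u$) and the rule $l \to r$,
since equations and rules are assumed variable disjoint. We now perform a case
analysis depending on the relationship of the positions $\tau$ and $\xi$.

Case 1: $\tau = \xi\pi$ for some $\pi$. In this case, we have $s|_\xi = s|_\xi[u\delta]_\pi \sim_{\mathcal{E}} s|_\xi[v\delta]_\pi =
s_{n-1}|_\xi \sim_{\mathcal{E}} l\delta$. This implies $s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} s[r\delta]_\xi = t'$, as desired.

Case 2: $\tau \perp \xi$. Now we have $s|_\xi = s_{n-1}|_\xi \sim_{\mathcal{E}} l\delta$ and thus,
$s \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} s[r\delta]_\xi = s[r\delta]_\xi[u\delta]_\tau \vdash\!\dashv_{\mathcal{E}} s[r\delta]_\xi[v\delta]_\tau = s[v\delta]_\tau[r\delta]_\xi = s_{n-1}[r\delta]_\xi = t'$.

Case 3: $\xi = \tau\pi$ for some $\pi$. Thus, $(v\delta)|_\pi \sim_{\mathcal{E}} l\delta$. We distinguish two sub-cases.

Case 3.1: $u\delta \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} q \sim_{\mathcal{E}} (v[r]_\pi)\delta$ for some term $q$. This implies
$s = s[u\delta]_\tau \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} s[q]_\tau \sim_{\mathcal{E}} s[(v[r]_\pi)\delta]_\tau = (s[v\delta]_\tau)[r\delta]_\xi = s_{n-1}[r\delta]_\xi = t'$.

Case 3.2: Otherwise. First assume that $\pi = \pi_1\pi_2$ where $v|_{\pi_1}$ is a variable $x$.
Hence, $(v\delta)|_\pi = \delta(x)|_{\pi_2}$. Let $\delta'(y) = \delta(y)$ for $y \neq x$ and let $\delta'(x) = \delta(x)[r\delta]_{\pi_2}$.

Since $u \approx v$ (or $v \approx u$) is an equation with identical unique variables, $x$ also
occurs in $u$ at some position $\pi'$. This implies $(u\delta)|_{\pi'\pi_2} = \delta(x)|_{\pi_2} \sim_{\mathcal{E}} l\delta \to_{Ext_{\mathcal{E}}(\mathcal{R})} r\delta$.
Hence, we obtain $u\delta \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} u\delta[r\delta]_{\pi'\pi_2} = u\delta' \sim_{\mathcal{E}} v\delta' = (v[r]_\pi)\delta$ in
contradiction to the condition of Case 3.2.

Hence, $\pi$ is a position of $v$ and $v|_\pi$ is not a variable. Thus, $(v\delta)|_\pi = v|_\pi\delta \sim_{\mathcal{E}} l\delta$.
Since rules and equations are assumed variable disjoint, the subterm $v|_\pi$ $\mathcal{E}$-unifies
with $l$. Thus, there exists a $\sigma \in uni_{\mathcal{E}}(v|_\pi, l)$ such that $\delta \sim_{\mathcal{E}} \sigma\rho$.

Due to Condition (b) of Def. 8, there is a term $q'$ such that $u\sigma \to^{\pi'}_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}
q' \sim_{\mathcal{E}} (v[r]_\pi)\sigma$. Since $\pi'$ is a position in $u$, we have $u|_{\pi'}\sigma \sim_{\mathcal{E}} \circ \to_{Ext_{\mathcal{E}}(\mathcal{R})} q''$ where
$q' = u\sigma[q'']_{\pi'}$. This also implies $u|_{\pi'}\delta \sim_{\mathcal{E}} u|_{\pi'}\sigma\rho \sim_{\mathcal{E}} \circ \to_{Ext_{\mathcal{E}}(\mathcal{R})} q''\rho$, and thus
$u\delta \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} u\delta[q''\rho]_{\pi'} \sim_{\mathcal{E}} u\sigma\rho[q''\rho]_{\pi'} = q'\rho \sim_{\mathcal{E}} (v[r]_\pi)\sigma\rho \sim_{\mathcal{E}} (v[r]_\pi)\delta$. This is a
contradiction to the condition of Case 3.2. □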

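The following theorem shows that $Ext_{\mathcal{E}}$ indeed has the desired property.

Theorem 11 (Termination of $\mathcal{R}/\mathcal{E}$ by $\mathcal{E}$-Extended Rewriting). Let $\mathcal{R}$ be
a TRS, let $\mathcal{E}$ be a set of equations with identical unique variables, and let $t$ be
a term. Then $t$ does not start an infinite $\to_{\mathcal{R}/\mathcal{E}}$-reduction iff $t$ does not start
an infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction. So in particular, $\mathcal{R}$ is terminating modulo $\mathcal{E}$
(i.e., $\to_{\mathcal{R}/\mathcal{E}}$ is well founded) iff $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$ is well founded.

Proof. The "only if" direction is straightforward because $\to_{Ext_{\mathcal{E}}(\mathcal{R})} = \to_{\mathcal{R}}$ and
therefore, $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} \subseteq \to_{Ext_{\mathcal{E}}(\mathcal{R})/\mathcal{E}} = \to_{\mathcal{R}/\mathcal{E}}$.

For the "if" direction, assume that $t$ starts an infinite $\to_{\mathcal{R}/\mathcal{E}}$-reduction

$$t = t_0 \to_{\mathcal{R}/\mathcal{E}} t_1 \to_{\mathcal{R}/\mathcal{E}} t_2 \to_{\mathcal{R}/\mathcal{E}} \ldots$$

For every $i \in \mathbb{N}$, let $f_{i+1}$ be a function from terms to terms such that for every
$t'_i \sim_{\mathcal{E}} t_i$, $f_{i+1}(t'_i)$ is a term $\mathcal{E}$-equivalent to $t_{i+1}$ such that $t'_i \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} f_{i+1}(t'_i)$.
These functions $f_{i+1}$ must exist due to Lemma 10, since $t'_i \sim_{\mathcal{E}} t_i$ and $t_i \to_{\mathcal{R}/\mathcal{E}}
t_{i+1}$ implies $t'_i \to_{\mathcal{R}/\mathcal{E}} t_{i+1}$. Hence, $t$ starts an infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction:

$$t \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} f_1(t) \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} f_2(f_1(t)) \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} f_3(f_2(f_1(t))) \to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} \ldots$$ □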

5 Dependency Pairs for Rewriting Modulo Equations 

In this section we finally extend the dependency pair approach to rewriting
modulo equations: To show that $\mathcal{R}$ modulo $\mathcal{E}$ terminates, one first constructs
the extension $Ext_{\mathcal{E}}(\mathcal{R})$ of $\mathcal{R}$. Subsequently, dependency pairs can be used to
prove well-foundedness of $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$ (which is equivalent to termination of $\mathcal{R}$
modulo $\mathcal{E}$). The idea for the extension of the dependency pair approach is simply
to modify Thm. 3 as follows.

1. The equations should be satisfied by the equivalence $\sim$ corresponding to the
quasi-ordering $\succsim$, i.e., we demand $u \sim v$ for all equations $u \approx v$ in $\mathcal{E}$.

2. A similar requirement is needed for equations $u \approx v$ when the root symbols
of $u$ and $v$ are replaced by the corresponding tuple symbols. We denote
tuples of terms $s_1, \ldots, s_n$ by $\mathbf{s}$ and for any term $t = f(\mathbf{s})$ with a defined root
symbol $f$, let $t^\sharp$ be the term $F(\mathbf{s})$. Hence, we also have to demand $u^\sharp \sim v^\sharp$.

3. The notion of "defined symbols" must be changed accordingly. As before, all
root symbols of left-hand sides of rules are regarded as being defined, but
if there is an equation $f(\mathbf{u}) \approx g(\mathbf{v})$ in $\mathcal{E}$ and $f$ is defined, then $g$ must be
considered defined as well, as otherwise we would not be able to trace the
redex in a reduction by only regarding subterms with defined root symbols.

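Definition 12 (Defined Symbols for Rewriting Modulo Equations). Let
$\mathcal{R}$ be a TRS and let $\mathcal{E}$ be a set of equations. Then the set of defined symbols $\mathcal{D}$
of $\mathcal{R}/\mathcal{E}$ is the smallest set such that $\mathcal{D} = \{root(l) \mid l \to r \in \mathcal{R}\} \cup \{root(v) \mid u \approx
v \in \mathcal{E}$ or $v \approx u \in \mathcal{E},\ root(u) \in \mathcal{D}\}$.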
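Since Def. 12 describes a least fixpoint, it can be computed by simple iteration. The following Python sketch assumes a tuple encoding of terms (variables as strings, applications as `(symbol, args)`) and non-variable left-hand sides; the function names `root` and `defined_symbols` are illustrative only.

```python
def app(symbol, *args):
    """Build the term symbol(args...); variables are plain strings."""
    return (symbol, tuple(args))


def root(term):
    """Root symbol of a term, or None if the term is a variable."""
    return None if isinstance(term, str) else term[0]


def defined_symbols(rules, equations):
    """Least set D containing the roots of all left-hand sides of rules and
    closed under: if u = v (in either orientation) is an equation and
    root(u) is in D, then root(v) is in D."""
    defined = {root(lhs) for lhs, _ in rules}
    changed = True
    while changed:
        changed = False
        for u, v in equations:
            for s, t in ((u, v), (v, u)):
                if root(s) in defined and root(t) is not None and root(t) not in defined:
                    defined.add(root(t))
                    changed = True
    return defined


# R = {f(x) -> x}, E = {f(a) = a}: the constant a becomes defined as well.
rules_f = [(app("f", "x"), "x")]
eqs_f = [(app("f", app("a")), app("a"))]
defined_f = defined_symbols(rules_f, eqs_f)
```

In this small system `defined_f` contains both `f` and `a`, because the equation relates a term with defined root `f` to the constant `a`.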

The constraints of the dependency pair approach as sketched above are not
yet sufficient for termination of $\to_{\mathcal{E}\backslash\mathcal{R}}$ as the following example illustrates.

Example 13. Consider $\mathcal{R} = \{f(x) \to x\}$ and $\mathcal{E} = \{f(a) \approx a\}$. There is no
dependency pair in this example and thus, the only constraints would be $f(x) \succsim x$,
$f(a) \sim a$, and $F(a) \sim A$. Obviously, these constraints are satisfiable (by using
an equivalence relation $\sim$ where all terms are equal). However, $\to_{\mathcal{E}\backslash\mathcal{R}}$ is not
terminating since we have $a \vdash\!\dashv_{\mathcal{E}} f(a) \to_{\mathcal{R}} a \vdash\!\dashv_{\mathcal{E}} f(a) \to_{\mathcal{R}} a \ldots$

The soundness of the dependency pair approach for ordinary rewriting (Thm.
3) relies on the fact that an infinite reduction from a minimal non-terminating
term can be achieved by applying only normalized instantiations of $\mathcal{R}$-rules. But
for $\mathcal{E}$-extended rewriting (or full rewriting modulo equations), this is not true
any more. For instance, the minimal non-terminating subterm $a$ in Ex. 13 is first
modified by applying an $\mathcal{E}$-equation (resulting in $f(a)$) and then an $\mathcal{R}$-rule is
applied whose variable is instantiated with the non-terminating term $a$. Hence,
the problem is that the new minimal non-terminating subterm $a$ which results
from application of the $\mathcal{R}$-rule does not correspond to the right-hand side of a
dependency pair, because this minimal non-terminating subterm is completely
inside the instantiation of a variable of the $\mathcal{R}$-rule. With ordinary rewriting, this
situation can never occur.

In Ex. 13, the problem can be avoided by adding a suitable instance of the
rule $f(x) \to x$ (viz. $f(a) \to a$) to $\mathcal{R}$, since this instance is used in the infinite
reduction. Now there would be a dependency pair $\langle F(a), A \rangle$ and with the additional
constraint $F(a) > A$ the resulting inequalities are no longer satisfiable.

The following definition shows how to add the right instantiations of the
rules in $\mathcal{R}$ in order to allow a sound application of dependency pairs. As usual,
a substitution $\nu$ is called a variable renaming iff the range of $\nu$ only contains
variables and if $\nu(x) \neq \nu(y)$ for $x \neq y$.

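Definition 14 (Adding Instantiations). Given a TRS $\mathcal{R}$, a set $\mathcal{E}$ of
equations, let $\mathcal{R}'$ be a set containing only rules of the form $l\sigma \to r\sigma$ (where $\sigma$ is a
substitution and $l \to r \in \mathcal{R}$). $\mathcal{R}'$ is an instantiation of $\mathcal{R}$ for the equations $\mathcal{E}$ iff

(a) $\mathcal{R} \subseteq \mathcal{R}'$,

(b) for all $l \to r \in \mathcal{R}$, all $u \approx v \in \mathcal{E}$ and $v \approx u \in \mathcal{E}$, and all $\sigma \in uni_{\mathcal{E}}(v, l)$, there
exists a rule $l' \to r' \in \mathcal{R}'$ and a variable renaming $\nu$ such that $l\sigma \sim_{\mathcal{E}} l'\nu$ and
$r\sigma \sim_{\mathcal{E}} r'\nu$.

In the following, let $Ins_{\mathcal{E}}(\mathcal{R})$ always denote an instantiation of $\mathcal{R}$ for $\mathcal{E}$.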

Unlike extensions $Ext_{\mathcal{E}}(\mathcal{R})$, instantiations $Ins_{\mathcal{E}}(\mathcal{R})$ are never infinite if $\mathcal{R}$
and $\mathcal{E}$ are finite and if $uni_{\mathcal{E}}(v, l)$ is always finite (i.e., they are not defined via a
fixpoint construction). In fact, one might even demand that for all $l \to r \in \mathcal{R}$, all
equations, and all $\sigma$ from the corresponding complete set of $\mathcal{E}$-unifiers, $Ins_{\mathcal{E}}(\mathcal{R})$
should contain $l\sigma \to r\sigma$. The condition that it is enough if some $\mathcal{E}$-equivalent
variable-renamed rule is already contained in $Ins_{\mathcal{E}}(\mathcal{R})$ is only added for efficiency
considerations in order to reduce the number of rules in $Ins_{\mathcal{E}}(\mathcal{R})$. Even without
this condition, $Ins_{\mathcal{E}}(\mathcal{R})$ would still be finite and all the following theorems would
hold as well.

However, the above instantiation technique only serves its purpose if there
are no collapsing equations (i.e., no equations $u \approx v$ or $v \approx u$ with $v \in \mathcal{V}$).

Example 15. Consider $\mathcal{R} = \{f(x) \to x\}$ and $\mathcal{E} = \{f(x) \approx x\}$. Note that $Ins_{\mathcal{E}}(\mathcal{R})
= \mathcal{R}$. Although $\to_{\mathcal{E}\backslash\mathcal{R}}$ is clearly not terminating, the dependency pair approach
would falsely prove termination of $\to_{\mathcal{E}\backslash\mathcal{R}}$, since there is no dependency pair.

Now we can present the main result of the paper. 

Theorem 16 (Termination of Equational Rewriting using Dependency
Pairs). Let $\mathcal{R}$ be a TRS and let $\mathcal{E}$ be a set of non-collapsing equations with
identical unique variables. $\mathcal{R}$ is terminating modulo $\mathcal{E}$ (i.e., $\to_{\mathcal{R}/\mathcal{E}}$ is well founded) if
there exists a weakly monotonic quasi-ordering $\succsim$ and a well-founded ordering $>$
compatible with $\succsim$ where both $\succsim$ and $>$ are closed under substitution, such that







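(1) $s > t$ for all dependency pairs $\langle s, t \rangle$ of $Ins_{\mathcal{E}}(Ext_{\mathcal{E}}(\mathcal{R}))$,

(2) $l \succsim r$ for all rules $l \to r$ of $\mathcal{R}$,

(3) $u \sim v$ for all equations $u \approx v$ of $\mathcal{E}$, and

(4) $u^\sharp \sim v^\sharp$ for all equations $u \approx v$ of $\mathcal{E}$ where $root(u)$ and $root(v)$ are defined.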

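Proof. Suppose that there is a term $t$ with an infinite $\to_{\mathcal{R}/\mathcal{E}}$-reduction. Thm.
11 implies that $t$ also has an infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction. By a minimality
argument, $t = C[t']$, where $t'$ is a minimal non-terminating term (i.e., $t'$ is
non-terminating, but all its proper subterms only have finite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions).
We will show that there exists a term $t_1$ with $t \to^+_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t_1$, $t_1$ contains a
minimal non-terminating subterm $t'_1$, and $t'^\sharp \succsim \circ > t'^\sharp_1$. By repeated application
of this construction we obtain an infinite sequence $t \to^+_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} t_1 \to^+_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}
t_2 \to^+_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} \ldots$ with $t'^\sharp \succsim \circ > t'^\sharp_1 \succsim \circ > t'^\sharp_2 \succsim \circ > \ldots$ This, however, is
a contradiction to the well-foundedness of $>$.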

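Let $t'$ have the form $f(\mathbf{u})$. In the infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction of $f(\mathbf{u})$, first
some $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-steps may be applied to $\mathbf{u}$ which yields new terms $\mathbf{v}$. Note that
due to the definition of $\mathcal{E}$-extended rewriting, in these reductions, no $\mathcal{E}$-steps can
be applied outside of $\mathbf{u}$. Due to the termination of $\mathbf{u}$, after a finite number of
those steps, a $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-step must be applied on the root position of $f(\mathbf{v})$.

Thus, there exists a rule $l \to r \in Ext_{\mathcal{E}}(\mathcal{R})$ such that $f(\mathbf{v}) \sim_{\mathcal{E}} l\alpha$ and hence,
the reduction yields $r\alpha$. Now the infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction continues with
$r\alpha$, i.e., the term $r\alpha$ starts an infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction, too. So up to now
the reduction has the following form (where $\to_{Ext_{\mathcal{E}}(\mathcal{R})}$ equals $\to_{\mathcal{R}}$):

$$t = C[f(\mathbf{u})] \to^*_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} C[f(\mathbf{v})] \sim_{\mathcal{E}} C[l\alpha] \to_{Ext_{\mathcal{E}}(\mathcal{R})} C[r\alpha].$$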

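We perform a case analysis depending on the positions of $\mathcal{E}$-steps in $f(\mathbf{v}) \sim_{\mathcal{E}} l\alpha$.

First consider the case where all $\mathcal{E}$-steps in $f(\mathbf{v}) \sim_{\mathcal{E}} l\alpha$ take place below the
root. Then we have $l = f(\mathbf{w})$ and $\mathbf{v} \sim_{\mathcal{E}} \mathbf{w}\alpha$. Let $t_1 := C[r\alpha]$. Note that the terms $\mathbf{v}$ do not
start infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions and by Thm. 11, they do not start infinite
$\to_{\mathcal{R}/\mathcal{E}}$-reductions either. But then the terms $\mathbf{w}\alpha$ also cannot start infinite $\to_{\mathcal{R}/\mathcal{E}}$-reductions
and therefore they also do not start infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions. This implies
that for all variables $x$ occurring in $f(\mathbf{w})$ the terms $\alpha(x)$ are terminating. Thus,
since $r\alpha$ starts an infinite reduction, there occurs a non-variable subterm $s$ in
$r$, such that $t'_1 := s\alpha$ is a minimal non-terminating term. Since $\langle l^\sharp, s^\sharp \rangle$ is a
dependency pair, we obtain $t'^\sharp = F(\mathbf{u}) \succsim F(\mathbf{v}) \sim l^\sharp\alpha > s^\sharp\alpha = t'^\sharp_1$. Here, $F(\mathbf{u}) \succsim
F(\mathbf{v})$ holds since $\mathbf{u} \to^*_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} \mathbf{v}$ and since $l \succsim r$ for every rule $l \to r \in Ext_{\mathcal{E}}(\mathcal{R})$.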

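Now we consider the case where there are $\mathcal{E}$-steps in $f(\mathbf{v}) \sim_{\mathcal{E}} l\alpha$ at the root
position. Thus we have $f(\mathbf{v}) \sim_{\mathcal{E}} f(\mathbf{q}) \vdash\!\dashv_{\mathcal{E}} p \sim_{\mathcal{E}} l\alpha$, where $f(\mathbf{q}) \vdash\!\dashv_{\mathcal{E}} p$ is the first
$\mathcal{E}$-step at the root position. In other words, there is an equation $u \approx v$ or $v \approx u$
in $\mathcal{E}$ such that $f(\mathbf{q})$ is an instantiation of $v$.

Note that since $\mathbf{v} \sim_{\mathcal{E}} \mathbf{q}$, the terms $\mathbf{q}$ only have finite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions
(the argumentation is similar as in the first case). Let $\delta$ be the substitution which
operates like $\alpha$ on the variables of $l$ and which yields $v\delta = f(\mathbf{q})$. Thus, $\delta$ is an
$\mathcal{E}$-unifier of $l$ and $v$. Since $l$ is $\mathcal{E}$-unifiable with $v$, there also exists a corresponding
complete $\mathcal{E}$-unifier $\sigma$ from $uni_{\mathcal{E}}(l, v)$. Thus, there is also a substitution $\rho$ such
that $\delta \sim_{\mathcal{E}} \sigma\rho$. As $l$ is a left-hand side of a rule from $Ext_{\mathcal{E}}(\mathcal{R})$, there is a rule
$l' \to r'$ in $Ins_{\mathcal{E}}(Ext_{\mathcal{E}}(\mathcal{R}))$ and a variable renaming $\nu$ such that $l\sigma \sim_{\mathcal{E}} l'\nu$ and
$r\sigma \sim_{\mathcal{E}} r'\nu$.

Hence, $v\sigma\rho \sim_{\mathcal{E}} v\delta = f(\mathbf{q})$, $l'\nu\rho \sim_{\mathcal{E}} l\sigma\rho \sim_{\mathcal{E}} l\delta = l\alpha$, and $r'\nu\rho \sim_{\mathcal{E}} r\sigma\rho \sim_{\mathcal{E}} r\delta =
r\alpha$. So instead we now consider the following reduction (where $\to_{Ins_{\mathcal{E}}(Ext_{\mathcal{E}}(\mathcal{R}))}$
equals $\to_{\mathcal{R}}$):

$$t = C[f(\mathbf{u})] \to^*_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})} C[f(\mathbf{v})] \sim_{\mathcal{E}} C[l'\nu\rho] \to_{Ins_{\mathcal{E}}(Ext_{\mathcal{E}}(\mathcal{R}))} C[r'\nu\rho] = t_1.$$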

Since all proper subterms of $v\delta$ only have finite $\to_{\mathcal{R}/\mathcal{E}}$-reductions, for all
variables $x$ of $l'\nu$, the term $x\rho$ only has finite $\to_{\mathcal{R}/\mathcal{E}}$-reductions and hence, also
only finite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions. To see this, note that since all equations have
identical unique variables, $v\sigma \sim_{\mathcal{E}} l\sigma \sim_{\mathcal{E}} l'\nu$ implies that all variables of $l'\nu$ also
occur in $v\sigma$. Thus, if $x$ is a variable from $l'\nu$, then there exists a variable $y$ in
$v$ such that $x$ occurs in $y\sigma$. Since $\mathcal{E}$ does not contain collapsing equations, $y$ is
a proper subterm of $v$ and thus, $y\delta$ is a proper subterm of $v\delta$. As all proper
subterms of $v\delta$ only have finite $\to_{\mathcal{R}/\mathcal{E}}$-reductions, this implies that $y\delta$ only has
finite $\to_{\mathcal{R}/\mathcal{E}}$-reductions, too. But then, since $y\delta \sim_{\mathcal{E}} y\sigma\rho$, the term $y\sigma\rho$ only has
finite $\to_{\mathcal{R}/\mathcal{E}}$-reductions, too. Then this also holds for all subterms of $y\sigma\rho$, i.e.,
all $\to_{\mathcal{R}/\mathcal{E}}$-reductions of $x\rho$ are also finite.

So for all variables $x$ of $l'$, $x\nu\rho$ only has finite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reductions. (Note
that this only holds because $\nu$ is just a variable renaming.) Since $r\alpha$ starts an
infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction, $r'\nu\rho \sim_{\mathcal{E}} r\alpha$ must start an infinite $\to_{\mathcal{R}/\mathcal{E}}$-reduction
(and hence, an infinite $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-reduction) as well. As for all variables $x$ of
$r'$, $x\nu\rho$ is $\to_{\mathcal{E}\backslash Ext_{\mathcal{E}}(\mathcal{R})}$-terminating, there must be a non-variable subterm $s$ of
$r'$, such that $t'_1 := s\nu\rho$ is a minimal non-terminating term. As $\langle l'^\sharp, s^\sharp \rangle$ is a
dependency pair, we obtain $t'^\sharp = F(\mathbf{u}) \succsim F(\mathbf{v}) \sim l'^\sharp\nu\rho > s^\sharp\nu\rho = t'^\sharp_1$. Here,
$F(\mathbf{v}) \sim l'^\sharp\nu\rho$ is a consequence of Condition (4). □

Now termination of the division-system (Ex. 9) can be proved by dependency
pairs. Here we have $Ins_{\mathcal{E}}(Ext_{\mathcal{E}}(\mathcal{R})) = Ext_{\mathcal{E}}(\mathcal{R})$ and thus, the resulting
constraints are

$$\begin{array}{ll}
M(s(x), s(y)) > M(x, y) & Q(0 \div s(y), z) > Q(0, z) \\
Q(s(x), s(y)) > M(x, y) & Q(s(x) \div s(y), z) > M(x, y) \\
Q(s(x), s(y)) > Q(x - y, s(y)) & Q(s(x) \div s(y), z) > Q(x - y, s(y)) \\
 & Q(s(x) \div s(y), z) > Q(s((x - y) \div s(y)), z)
\end{array}$$

as well as $l \succsim r$ for all rules $l \to r$, $(u \div v) \div w \sim (u \div w) \div v$, and $Q(u \div
v, w) \sim Q(u \div w, v)$. (Here, $M$ and $Q$ are the tuple symbols for the minus-symbol
"$-$" and the quot-symbol "$\div$".) As explained in Sect. 2 one may again
eliminate arguments of function symbols before searching for suitable orderings.
In this example we will eliminate the second arguments of $-$, $\div$, $M$, and $Q$
(i.e., every term $s - t$ is replaced by $-'(s)$, etc.). Then the resulting inequalities
are satisfied by the rpo with the precedence $\div' \sqsupset s \sqsupset -'$, $Q' \sqsupset M'$. Thus,
with the method of the present paper, one can now verify termination of this
example automatically for the first time. This example also demonstrates that
by using dependency pairs, termination of equational rewriting can sometimes
even be shown by ordinary base orderings (e.g., the ordinary rpo which on its
own cannot be used for rewriting modulo equations).
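The seven strict constraints listed for the division system can be recomputed from the rules. The Python sketch below is an illustration only: it assumes a tuple encoding of terms, the symbol names `minus` and `quot`, and hypothetical helpers `subterms`, `sharp`, and `dependency_pairs`. It forms a pair $\langle l^\sharp, s^\sharp \rangle$ for every rule $l \to r$ and every subterm $s$ of $r$ with a defined root symbol.

```python
def app(symbol, *args):
    """Build the term symbol(args...); variables are plain strings."""
    return (symbol, tuple(args))


def subterms(term):
    """Yield the term itself and all of its subterms."""
    yield term
    if not isinstance(term, str):
        for arg in term[1]:
            yield from subterms(arg)


def sharp(term, tuple_names):
    """Replace the root symbol of a term by its tuple symbol."""
    return (tuple_names[term[0]], term[1])


def dependency_pairs(rules, defined, tuple_names):
    """All pairs (l#, s#) for rules l -> r and subterms s of r with defined root."""
    pairs = []
    for lhs, rhs in rules:
        for s in subterms(rhs):
            if not isinstance(s, str) and s[0] in defined:
                pairs.append((sharp(lhs, tuple_names), sharp(s, tuple_names)))
    return pairs


def minus(s, t):
    return app("minus", s, t)


def quot(s, t):
    return app("quot", s, t)


def succ(t):
    return app("s", t)


zero = app("0")
x, y, z = "x", "y", "z"

# The four rules of the division system followed by its two extension rules.
ext_rules = [
    (minus(x, zero), x),
    (minus(succ(x), succ(y)), minus(x, y)),
    (quot(zero, succ(y)), zero),
    (quot(succ(x), succ(y)), succ(quot(minus(x, y), succ(y)))),
    (quot(quot(zero, succ(y)), z), quot(zero, z)),
    (quot(quot(succ(x), succ(y)), z), quot(succ(quot(minus(x, y), succ(y))), z)),
]
pairs = dependency_pairs(ext_rules, {"minus", "quot"}, {"minus": "M", "quot": "Q"})
```

With this input `pairs` has seven entries, one for each strict inequality in the constraint list.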
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6 Conclusion 

We have extended the dependency pair approach to equational rewriting. In the
special case of AC-axioms, our method is similar to the ones previously presented
in [15,17]. In fact, as long as the equations only consist of AC-axioms, one can
show that using the instances $Ins_{\mathcal{E}}$ in Thm. 16 is not necessary. (Hence, such a
concept cannot be found in [17].) However, even then the only additional
inequalities resulting from $Ins_{\mathcal{E}}$ are instantiations of other inequalities already present
and inequalities which are special cases of an AC-deletion property (which is
satisfied by all known AC-orderings and similar to the one required in [15]). This
indicates that in practical examples with AC-axioms, our technique is at least
as powerful as the ones of [15,17] (actually, we conjecture that for AC-examples,
these three techniques are virtually equally powerful). But compared to the
approaches of [15,17], our technique has a more elegant treatment of tuple symbols.
(For example, if the TRS contains a rule $f(t_1, t_2) \to g(f(s_1, s_2), s_3)$ where $f$ and $g$
are defined AC-symbols, then we do not have to extend the TRS by rules with
tuple symbols like $f(t_1, t_2) \to G(f(s_1, s_2), s_3)$ in [17]. Moreover, we do not need
dependency pairs where tuple symbols occur outside the root position such as
$\langle \ldots \rangle$ in [17] and [15] and $\langle F(t_1, t_2), G(F(s_1, s_2), s_3) \rangle$ in [15]. Finally,
we also do not need the "AC-marked condition" $F(f(x, y), z) \sim F(F(x, y), z)$ of
[15].) But most significantly, unlike [15,17] our technique works for arbitrary
non-collapsing equations $\mathcal{E}$ with identical unique variables where $\mathcal{E}$-unification
is finitary (for subterms of equations and left-hand sides of rules). Obviously,
an implementation of our technique also requires $\mathcal{E}$-unification algorithms [5] for
the concrete sets of equations $\mathcal{E}$ under consideration.

In [1,2,3], Arts and Giesl presented the dependency graph refinement which
is based on the observation that it is possible to treat subsets of the
dependency pairs separately. This refinement carries over to the equational case in a
straightforward way (by using $\mathcal{E}$-unification to compute an estimation of this
graph). For details on this refinement and for further examples to demonstrate
the power and the usefulness of our technique, the reader is referred to [11].

Acknowledgments. We thank A. Middeldorp, T. Arts, and the referees for 
comments. 
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Abstract. Proving termination of a rewrite system by an interpretation 
over the natural numbers directly implies an upper bound on the deriva- 
tional complexity of the system. In this way, however, the derivation 
height of terms is often heavily overestimated. 

Here we present a generalization of termination proofs by interpretations 
that can avoid this drawback of the traditional approach. A number of 
simple examples illustrate how to achieve tight or even optimal bounds 
on the derivation height. The method is general enough to capture cases 
where simplification orderings fail. 



1 Introduction 

Proving termination of a rewrite system by an interpretation into some
well-founded domain is perhaps the most natural approach among the numerous
methods developed in the last few decades. Early references are [...], but
the idea of termination proofs by interpretations can at least be traced back to
Turing [...].

We are interested in extracting upper bounds on the derivational complexity
of rewrite systems from termination proofs. Proving termination by an
interpretation over the natural numbers always directly yields such a bound for the
system [...]. In particular, this is true for polynomial interpretations [...]
(cf. [...]), but also for interpretations using other subrecursive function classes
as in [...], for instance. Upper bound results have also been obtained for most of
the standard reduction orders like multiset path orders [...], lexicographic path
orders [...], Knuth-Bendix orders [...], or for all terminating ground
systems [...]. All these results are proven either by combinatorial considerations or,
more interestingly in the context of the present paper, by transforming
termination proofs via 'syntactic' orders into proofs via interpretations into the natural
numbers.

The major disadvantage of using monotone interpretations for upper bound
results is that in this way the derivational complexity is often heavily
overestimated. A natural question therefore is how to prove polynomial upper bounds
(if they exist), or even asymptotically optimal bounds for polynomials of small
degree. As one remedy it was suggested to consider syntactic restrictions of some
standard termination orders like multiset or lexicographic path orders [...].
Another approach uses incremental termination proofs as described in [...]. In this
paper, in contrast, we present a generalization of the notion of interpretations
in order to obtain tight or even optimal upper bounds. For introductions to
term rewriting, especially to termination of rewriting, we refer to [...].

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. ..., 2001.
© Springer-Verlag Berlin Heidelberg 2001

An interpretation for a rewrite system $R$ is an order-preserving mapping
$\tau$ from $(T_\Sigma, \to_R)$ into some (partially) ordered set $(A, >)$, that is, a mapping
$\tau : T_\Sigma \to A$ satisfying $\to_R\ \subseteq\ >_\tau$, where $>_\tau$ is the order on $T_\Sigma$ induced by $\tau$
(i.e., $s >_\tau t$ iff $\tau(s) > \tau(t)$). Clearly, if $(A, >)$ is well-founded then $(T_\Sigma, \to_R)$ is
well-founded too, and we have a termination proof for $R$ via $\tau$.

Often, $A$ has the structure of a $\Sigma$-algebra, that is, for each $n$-ary symbol
$f \in \Sigma$ ($n \geq 0$) there is an associated mapping $f_\tau : A^n \to A$. In this case the
mapping $\tau : T_\Sigma \to A$ can be chosen as the unique $\Sigma$-homomorphism from the
term algebra into $A$, where

$$\tau(f(t_1, \ldots, t_n)) = f_\tau(\tau(t_1), \ldots, \tau(t_n)). \qquad (1)$$

Such a homomorphic interpretation $\tau$ into a $\Sigma$-algebra $A$ is particularly
convenient if $A$ is a strictly monotone ordered algebra (cf. [...], p. 265), that is, if
$(A, >)$ is a poset, and all its operations are strictly monotone in each argument:
$a_i > a'_i$ implies $f_\tau(a_1, \ldots, a_i, \ldots, a_n) > f_\tau(a_1, \ldots, a'_i, \ldots, a_n)$ for $a_i, a'_i \in A$,
$f \in \Sigma_n$, $1 \leq i \leq n$. Then $>_\tau$ is compatible with $\Sigma$-contexts: if $s >_\tau t$ then
$c[s] >_\tau c[t]$ for ground terms $s, t$ and ground contexts $c$. Thus proving
termination of a rewrite system $R$ via $\tau$ amounts to checking $\ell\gamma >_\tau r\gamma$ for each rule $\ell \to r$
in $R$ and each ground substitution $\gamma$ only. In this situation, $\tau$ is called a strictly
monotone interpretation for $R$.

We are interested in extracting bounds on the length of derivation sequences
from termination proofs by interpretations. Let $R$ be a terminating rewrite
system where $\to_R$ is finitely branching (which is always true for finite systems).
Then the derivation height function with respect to $R$ on $T_\Sigma$ is defined by

$$dh_R(t) = \max\{n \in \mathbb{N} \mid \exists s : t \to_R^n s\},$$

that is, $dh_R(t)$ is the height of $t$ modulo $\to_R$, see [...]. The derivational complexity
of $R$ is the function $dc_R : \mathbb{N} \to \mathbb{N}$ with

$$dc_R(n) = \max\{dh_R(t) \mid size(t) \leq n\}.$$

In case we found a mapping $\tau : T_\Sigma \to \mathbb{N}$ with $\to_R\ \subseteq\ >_\tau$, that is, if

$$s \to_R t \text{ implies } \tau(s) - \tau(t) \geq 1 \qquad (2)$$

for $s, t \in T_\Sigma$, then clearly

$$dh_R(t) \leq \tau(t). \qquad (3)$$

Typically, however, the derivation height is heavily overestimated by this
inequality. As a simple introductory example consider the term rewrite system $R$
with the one rule

$$a(b(x)) \to b(a(x))$$

over a signature containing the two unary function symbols $a$, $b$ and the constant
symbol $c$ (in order to guarantee the existence of ground terms). Here we obtain a
termination proof by a strictly monotone interpretation over the natural numbers
$\mathbb{N} = \{0, 1, \ldots\}$ by choosing

$$a_\tau(n) = 2n, \quad b_\tau(n) = 1 + n, \quad c_\tau = 0. \qquad (4)$$

Strict monotonicity of $a_\tau$ and $b_\tau$ means $a_\tau(n + 1) > a_\tau(n)$ and $b_\tau(n + 1) > b_\tau(n)$
for $n \in \mathbb{N}$; equivalently,

$$a_\tau(n + 1) - a_\tau(n) \geq 1, \quad b_\tau(n + 1) - b_\tau(n) \geq 1. \qquad (5)$$

It remains to show $\ell\gamma >_\tau r\gamma$ for the above rewrite rule $\ell \to r$ and any ground
substitution $\gamma$. Indeed, $\tau(a(b(t))) - \tau(b(a(t))) = 2(1 + \tau(t)) - (1 + 2\tau(t)) = 1$ for
$t \in T_\Sigma$. Here, for example terms of the form $a^*b^*c$, we get

$$\tau(a^n b^m c) = 2^n \cdot m,$$

whereas, as is easily seen,

$$dh_R(a^n b^m c) = n \cdot m.$$
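The gap between the two values can be observed on small instances. The following Python sketch is an illustration under simplifying assumptions: ground terms over the unary symbols $a$, $b$ and the constant $c$ are encoded as strings over `a` and `b` (the trailing $c$ is left implicit), and the function names `successors`, `dh`, and `tau` are chosen here for illustration. It computes the derivation height by exhaustive search and the interpretation (4) directly.

```python
from functools import lru_cache


def successors(word):
    """All words reachable in one step of a(b(x)) -> b(a(x)), i.e. ab -> ba."""
    return [word[:i] + "ba" + word[i + 2:]
            for i in range(len(word) - 1) if word[i:i + 2] == "ab"]


@lru_cache(maxsize=None)
def dh(word):
    """Derivation height: length of a longest rewrite sequence from word."""
    return max((1 + dh(v) for v in successors(word)), default=0)


def tau(word):
    """Interpretation (4): a doubles, b adds one, the constant c is 0."""
    value = 0
    for symbol in reversed(word):
        value = 2 * value if symbol == "a" else value + 1
    return value


n, m = 3, 4
word = "a" * n + "b" * m
height, bound = dh(word), tau(word)   # n * m versus 2**n * m
```

For $n = 3$ and $m = 4$ the search gives a derivation height of 12, while the interpretation only yields the bound 32.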

Thus, $\tau$ can be exponential in the size of terms while $dc_R$ is polynomially
bounded. The explanation is that, although rewrite steps at the root of terms
cause a decrease of only 1 under the interpretation, rewrite steps further down
in the tree exhibit an arbitrarily large decrease. For instance, the single rewrite
step $a^n abc \to_R a^n bac$ is reflected by $\tau(a^n abc) - \tau(a^n bac) = 2^{n+1} - 2^n = 2^n$.

Before formally defining context-dependent interpretations in the next
section we will already now illustrate their use in obtaining tight bounds on the
derivational complexity for the introductory example. Crucial for our approach is
that we consider a family of interpretations $\tau[\Delta]$ rather than a single one, where
the parameter $\Delta$ is a positive real number, that is, $\Delta \in \mathbb{R}^+$. Furthermore, the
domain $\mathbb{N}$ is replaced by the domain $\mathbb{R}^+_0$ of non-negative real numbers, and the
standard well-ordering on $\mathbb{N}$ is replaced by a family of well-founded (partial)
orderings $>_\Delta$ on $\mathbb{R}^+_0$ for $\Delta > 0$ with

$$z' >_\Delta z \text{ iff } z' - z \geq \Delta.$$

^ It is a matter of taste whether we work with A > 0 or with <5, 0 < <5 < 1, by the 
bijection A 1 — 5/(1 — 5) and 5 1— >■ A/(l -I- A). 
Similar to the traditional approach, if each mapping $\tau[\Delta] : T_\Sigma \to \mathbb{R}_0^+$ into the
order $(\mathbb{R}_0^+, >_\Delta)$ satisfies ${\to_R} \subseteq {>_{\tau[\Delta]}}$, that is, if

$s \to_R t$ implies $\tau[\Delta](s) - \tau[\Delta](t) \ge \Delta$ (3)

for $s, t \in T_\Sigma$, then $R$ is terminating and $\mathrm{dh}_R(t) \le \tau[\Delta](t)/\Delta$, hence

$\mathrm{dh}_R(t) \le \inf_{\Delta > 0} \frac{\tau[\Delta](t)}{\Delta}$

for $t \in T_\Sigma$. In order to guarantee (3) we again proceed in analogy to the
traditional approach. We generalize (1) in defining the function
$\tau : \mathbb{R}^+ \times T_\Sigma \to \mathbb{R}_0^+$ by

$\tau[\Delta](f(t_1, \ldots, t_n)) = f_\tau[\Delta](\tau[f_\tau^1(\Delta)](t_1), \ldots, \tau[f_\tau^n(\Delta)](t_n)),$

using functions $f_\tau : \mathbb{R}^+ \times (\mathbb{R}_0^+)^n \to \mathbb{R}_0^+$ and $f_\tau^i : \mathbb{R}^+ \to \mathbb{R}^+$ for each $n$-ary symbol
$f \in \Sigma$ and $1 \le i \le n$. (We write $\tau[\Delta](t)$ instead of $\tau(\Delta, t)$ and $f_\tau[\Delta](z_1, \ldots, z_n)$
instead of $f_\tau(\Delta, z_1, \ldots, z_n)$ to emphasize the special role of the first argument.)
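Returning to our example, we modify the interpretation functions and obtain

$a_\tau[\Delta](z) = (1 + \Delta)z, \quad b_\tau[\Delta](z) = 1 + z, \quad c_\tau[\Delta] = 0.$

This will be the only 'creative' step in finding a context-dependent interpretation
for $R$. But how then to choose the functions $a_\tau^1$ and $b_\tau^1$? As we will explain in
more detail in the next section, $\Delta$-monotonicity of $f_\tau$ is required in our approach.
Informally, this says that $f_\tau[\Delta]$ propagates a difference of at least $\Delta$ provided a
difference of at least $f_\tau^i(\Delta)$ in argument position $i$ is given:

$a_\tau[\Delta](z + a_\tau^1(\Delta)) - a_\tau[\Delta](z) \ge \Delta,$
$b_\tau[\Delta](z + b_\tau^1(\Delta)) - b_\tau[\Delta](z) \ge \Delta.$

Solving these two requirements gives $(1 + \Delta)(z + a_\tau^1(\Delta)) - (1 + \Delta)z = (1 + \Delta)a_\tau^1(\Delta) \ge \Delta$, that is,
$a_\tau^1(\Delta) \ge \Delta/(1 + \Delta)$, and $(1 + z + b_\tau^1(\Delta)) - (1 + z) \ge \Delta$, that is, $b_\tau^1(\Delta) \ge \Delta$.
Therefore, the most natural choice is

$a_\tau^1(\Delta) = \frac{\Delta}{1 + \Delta}, \quad b_\tau^1(\Delta) = \Delta.$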
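To sum up, we found in a rather systematic way the following interpretation:

$\tau[\Delta](a(t)) = (1 + \Delta) \cdot \tau[\tfrac{\Delta}{1+\Delta}](t),$
$\tau[\Delta](b(t)) = 1 + \tau[\Delta](t),$
$\tau[\Delta](c) = 0.$

All that remains to show is $\tau[\Delta](a(b(t))) - \tau[\Delta](b(a(t))) \ge \Delta$ for $t \in T_\Sigma$. Indeed,

$\tau[\Delta](a(b(t))) - \tau[\Delta](b(a(t))) = (1 + \Delta)(1 + \tau[\tfrac{\Delta}{1+\Delta}](t)) - (1 + (1 + \Delta)\tau[\tfrac{\Delta}{1+\Delta}](t)) = \Delta.$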
For the terms $a^n b^m c$ discussed above we obtain^2

$\tau[\Delta](a^n b^m c) = (1 + \Delta n)m,$

thus

$\inf_{\Delta > 0} \frac{\tau[\Delta](a^n b^m c)}{\Delta} = \inf_{\Delta > 0} \left(\frac{1}{\Delta} + n\right) m = n \cdot m = \mathrm{dh}_R(a^n b^m c).$

So for these terms, the above context-dependent interpretation does yield the
precise derivation height. Even better, this is true for all ground terms; a
proof of $\inf_{\Delta > 0} \tau[\Delta](t)/\Delta = \mathrm{dh}_R(t)$ for $t \in T_\Sigma$ can be found in appendix 5.1
(the present system $R$ corresponds to $R_1$ there, cf. Section 3.2).
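The closed form for the terms a^n b^m c can be reproduced with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: terms are encoded as strings such as "aabbc" (an encoding chosen here, not used in the paper), and the names `tau_cd` and `upper_bound` are hypothetical. It evaluates the context-dependent interpretation with the parameter passed to the argument of a changed to Delta/(1 + Delta), and the resulting bound tau[Delta](t)/Delta on the derivation height.

```python
from fractions import Fraction


def tau_cd(delta: Fraction, term: str) -> Fraction:
    """Context-dependent interpretation tau[delta] for a(b(x)) -> b(a(x))."""
    if term == "c":
        return Fraction(0)
    head, rest = term[0], term[1:]
    if head == "a":
        # a_tau[delta](z) = (1 + delta) z, argument parameter delta / (1 + delta)
        return (1 + delta) * tau_cd(delta / (1 + delta), rest)
    if head == "b":
        # b_tau[delta](z) = 1 + z, argument parameter delta
        return 1 + tau_cd(delta, rest)
    raise ValueError(f"unexpected symbol {head!r}")


def upper_bound(delta: Fraction, term: str) -> Fraction:
    """The bound tau[delta](t) / delta on the derivation height of t."""
    return tau_cd(delta, term) / delta


if __name__ == "__main__":
    term = "aaabbc"  # n = 3, m = 2, derivation height 6
    for delta in (Fraction(1, 10), Fraction(1), Fraction(10), Fraction(1000)):
        print(delta, tau_cd(delta, term), float(upper_bound(delta, term)))
```

The bound (1/Delta + n) m decreases as Delta grows, so in this example the infimum n * m is approached for large Delta rather than attained for any single value.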



2 Context-Dependent Interpretations 

In this section, context-dependent interpretations are introduced more formally, 
and it is shown how the traditional approach appears as a special case. 

A context-dependent interpretation consists of functions $f_\tau : \mathbb{R}^+ \times (\mathbb{R}_0^+)^n \to \mathbb{R}_0^+$
and $f_\tau^i : \mathbb{R}^+ \to \mathbb{R}^+$ for each $n$-ary symbol $f \in \Sigma$ and $1 \le i \le n$. They induce
a function $\tau : \mathbb{R}^+ \times T_\Sigma \to \mathbb{R}_0^+$ by

$\tau[\Delta](f(t_1, \ldots, t_n)) = f_\tau[\Delta](\tau[f_\tau^1(\Delta)](t_1), \ldots, \tau[f_\tau^n(\Delta)](t_n)).$

We always assume $\Delta$-monotonicity of $f_\tau$, that is,

if $z_i' - z_i \ge f_\tau^i(\Delta)$
then $f_\tau[\Delta](z_1, \ldots, z_i', \ldots, z_n) - f_\tau[\Delta](z_1, \ldots, z_i, \ldots, z_n) \ge \Delta$ (6)

for $z_i', z_i \in \mathbb{R}_0^+$. Note that this does not imply weak monotonicity of $f_\tau[\Delta]$ (seen
as $n$-ary function). However, if weak monotonicity is given, that is, if $z_i' \ge z_i$
implies $f_\tau[\Delta](\ldots, z_i', \ldots) \ge f_\tau[\Delta](\ldots, z_i, \ldots)$, then (6) specializes to

$f_\tau[\Delta](z_1, \ldots, z_i + f_\tau^i(\Delta), \ldots, z_n) - f_\tau[\Delta](z_1, \ldots, z_i, \ldots, z_n) \ge \Delta.$

Further assume that $\tau$ is compatible with a rewrite system $R$, that is, for each
rule $\ell \to r$ in $R$ and each ground substitution $\gamma$,

$\tau[\Delta](\ell\gamma) - \tau[\Delta](r\gamma) \ge \Delta.$ (7)

Under these two conditions the following property holds. 

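Lemma 1. For $s, t \in T_\Sigma$ and $\Delta \in \mathbb{R}^+$,

$s \to_R t$ implies $\tau[\Delta](s) - \tau[\Delta](t) \ge \Delta$.

^2 By induction: Clearly $\tau[\Delta](b^m c) = m$ and $\tau[\Delta](a^{n+1} b^m c) = (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](a^n b^m c) =
(1 + \Delta)(1 + \tfrac{\Delta}{1+\Delta} n)m$ (by the induction hypothesis) $= (1 + \Delta(n + 1))m$.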
Proof. By induction on $s$. If the rewrite step takes place at the root position in
$s$, the claim follows from (7). Otherwise, assume a rewrite step at position $i.p$ in
$s$. Then $s = f(s_1, \ldots, s_i, \ldots, s_n)$, $t = f(s_1, \ldots, t_i, \ldots, s_n)$, and $s_i \to_R t_i$. From
the induction hypothesis,

$\tau[f_\tau^i(\Delta)](s_i) - \tau[f_\tau^i(\Delta)](t_i) \ge f_\tau^i(\Delta).$

Therefore, by $\Delta$-monotonicity,

$\tau[\Delta](s) - \tau[\Delta](t) = f_\tau[\Delta](\ldots, \tau[f_\tau^i(\Delta)](s_i), \ldots) - f_\tau[\Delta](\ldots, \tau[f_\tau^i(\Delta)](t_i), \ldots) \ge \Delta.$ □



Proposition. Let $\tau$ be a context-dependent interpretation compatible with a
rewrite system $R$. Then $R$ is terminating and $\mathrm{dh}_R(t) \le \inf_{\Delta > 0} \tau[\Delta](t)/\Delta$ holds for $t \in T_\Sigma$.



Remark. A more general version of this result is obtained by choosing an
appropriate domain $D \subseteq \mathbb{R}^+$ such that $\Delta$-monotonicity (6) holds for $\Delta \in D$,
such that $\Delta$-compatibility (7) holds for $\Delta \in D$, and
such that $f_\tau^i(D) \subseteq D$ for each function $f_\tau^i$. Then by the same reasoning we get

$\mathrm{dh}_R(t) \le \inf_{\Delta \in D} \frac{\tau[\Delta](t)}{\Delta}.$



2.1 A Special Case 

The traditional (non context-dependent) approach appears as a special case.
Let us assume that termination of a rewrite system $R$ is provable by a strictly
monotone interpretation over the natural numbers, that is, $\tau(\ell\gamma) > \tau(r\gamma)$ for
rules $\ell \to r$ in $R$ and ground substitutions $\gamma$, where $\tau : T_\Sigma \to \mathbb{N}$ is induced by a
family of strictly monotone functions $f_\tau$ ($f \in \Sigma$). Extend each $n$-ary function
$f_\tau$ from $\mathbb{N}^n$ to $(\mathbb{R}_0^+)^n$, preserving not only strict monotonicity but also the
property (which over the natural numbers coincides with strict monotonicity) that
$f_\tau(\ldots, z_i + 1, \ldots) - f_\tau(\ldots, z_i, \ldots) \ge 1$; this is always possible by choosing^3

$f_\tau(k_1 + x_1, \ldots, k_n + x_n) = \sum_{b_i \in \{0,1\}} \left( f_\tau(k_1 + b_1, \ldots, k_n + b_n) \cdot \prod_{i=1}^{n} \big((1 - b_i)(1 - x_i) + b_i x_i\big) \right)$

for $k_i \in \mathbb{N}$ and $0 \le x_i < 1$. Now define functions $f_\tau[\Delta]$ and $f_\tau^i$ by

$f_\tau[\Delta](z_1, \ldots, z_n) = \Delta f_\tau(z_1/\Delta, \ldots, z_n/\Delta), \quad f_\tau^i(\Delta) = \Delta.$

Then we get the following properties.

^3 For instance, for $n = 2$ we get $f_\tau(k_1 + x_1, k_2 + x_2) = f_\tau(k_1, k_2)(1 - x_1)(1 - x_2) +
f_\tau(k_1, k_2 + 1)(1 - x_1)x_2 + f_\tau(k_1 + 1, k_2)x_1(1 - x_2) + f_\tau(k_1 + 1, k_2 + 1)x_1 x_2$.
Lemma 2. (i) $\tau[\Delta](t) = \Delta\tau(t)$, hence $\inf_{\Delta > 0} \tau[\Delta](t)/\Delta = \tau(t)$ for $t \in T_\Sigma$.
(ii) $\tau$ is compatible with $R$. (iii) $\Delta$-monotonicity holds true.

Proof. (i) By induction on $t$:

$\tau[\Delta](f(\ldots, t_i, \ldots)) = f_\tau[\Delta](\ldots, \tau[f_\tau^i(\Delta)](t_i), \ldots)$
$= f_\tau[\Delta](\ldots, \tau[\Delta](t_i), \ldots)$
$= f_\tau[\Delta](\ldots, \Delta\tau(t_i), \ldots)$ (by ind. hypothesis)
$= \Delta f_\tau(\ldots, \tau(t_i), \ldots)$
$= \Delta\tau(f(\ldots, t_i, \ldots)).$

(ii) We know $\tau(\ell\gamma) - \tau(r\gamma) \ge 1$, thus $\tau[\Delta](\ell\gamma) - \tau[\Delta](r\gamma) = \Delta\tau(\ell\gamma) - \Delta\tau(r\gamma) \ge \Delta$
by (i). (iii) Strict monotonicity of $f_\tau[\Delta]$ is implied by strict monotonicity of $f_\tau$,
so it suffices to show $f_\tau[\Delta](\ldots, z_i + f_\tau^i(\Delta), \ldots) - f_\tau[\Delta](\ldots, z_i, \ldots) \ge \Delta$. This is
equivalent to $\Delta f_\tau(\ldots, (z_i + \Delta)/\Delta, \ldots) - \Delta f_\tau(\ldots, z_i/\Delta, \ldots) \ge \Delta$, thus follows
from $f_\tau(\ldots, z_i/\Delta + 1, \ldots) - f_\tau(\ldots, z_i/\Delta, \ldots) \ge 1$. □

3 More Examples 

3.1 Associativity 

As another example we study termination and derivational complexity of the 
associativity rule 



$(x \circ y) \circ z \to x \circ (y \circ z)$

over a signature containing the binary function symbol $\circ$ and the constant symbol
$c$ (again in order to guarantee the existence of ground terms); let us call the
system $R$, as always. Termination by a strictly monotone interpretation
$\tau : T_\Sigma \to \mathbb{N}$ can be easily checked for the choice

$\circ_\tau(n_1, n_2) = 2n_1 + n_2 + 1, \quad c_\tau = 0.$

We use the same heuristics we found useful before: Replace in the interpretation
functions (some occurrence of) $k + 1$ by $k + \Delta$. This first step in finding a
context-dependent interpretation yields

$\circ_\tau[\Delta](z_1, z_2) = (1 + \Delta)z_1 + z_2 + 1, \quad c_\tau[\Delta] = 0.$

And again, the functions $\circ_\tau^i$ can be found by solving the $\Delta$-monotonicity
requirements

$\circ_\tau[\Delta](z_1 + \circ_\tau^1(\Delta), z_2) - \circ_\tau[\Delta](z_1, z_2) \ge \Delta,$
$\circ_\tau[\Delta](z_1, z_2 + \circ_\tau^2(\Delta)) - \circ_\tau[\Delta](z_1, z_2) \ge \Delta.$

We get $((1 + \Delta)(z_1 + \circ_\tau^1(\Delta)) + z_2 + 1) - ((1 + \Delta)z_1 + z_2 + 1) = (1 + \Delta)\circ_\tau^1(\Delta) \ge \Delta$,
that is, $\circ_\tau^1(\Delta) \ge \Delta/(1 + \Delta)$, and $((1 + \Delta)z_1 + z_2 + \circ_\tau^2(\Delta) + 1) - ((1 + \Delta)z_1 + z_2 + 1) =
\circ_\tau^2(\Delta) \ge \Delta$, therefore we choose

$\circ_\tau^1(\Delta) = \frac{\Delta}{1 + \Delta}, \quad \circ_\tau^2(\Delta) = \Delta,$

resulting in the following context-dependent interpretation:

$\tau[\Delta](s \circ t) = (1 + \Delta) \cdot \tau[\tfrac{\Delta}{1+\Delta}](s) + \tau[\Delta](t) + 1, \quad \tau[\Delta](c) = 0.$

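Lemma 3. For $r, s, t \in T_\Sigma$, $\tau[\Delta]((r \circ s) \circ t) - \tau[\Delta](r \circ (s \circ t)) \ge \Delta$.

Proof. By definition of $\tau$,

$\tau[\Delta]((r \circ s) \circ t) - \tau[\Delta](r \circ (s \circ t))$
$= \big((1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r \circ s) + \tau[\Delta](t) + 1\big) - \big((1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r) + \tau[\Delta](s \circ t) + 1\big)$
$= \big((1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](r) + (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](s) + (1 + \Delta) + \tau[\Delta](t) + 1\big)$
$\quad - \big((1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r) + (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](s) + \tau[\Delta](t) + 2\big)$
$= (1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](r) + \Delta - (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r).$

Thus, we have to prove

$(1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](r) - (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r) \ge 0.$ (8)

By induction on $r$: For $r = c$ we get $0 - 0 \ge 0$. Note that this is the only place
where $\tau[\Delta](c)$ comes into play. For $r = s \circ t$ we get

$(1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](s \circ t) - (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](s \circ t) =$
$\big((1 + 3\Delta)\,\tau[\tfrac{\Delta}{1+3\Delta}](s) - (1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](s)\big) +$
$\big((1 + 2\Delta)\,\tau[\tfrac{\Delta}{1+2\Delta}](t) - (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](t)\big) + \big((1 + 2\Delta) - (1 + \Delta)\big).$

The first and the second difference are non-negative by the induction hypothesis
(for the first one substituting $\Delta/(1 + \Delta)$ for $\Delta$ in (8)), and the third difference
is non-negative as $\Delta > 0$. □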

As example terms consider $R_n$, $L_n$ (right and left combs respectively), and
$B_n$ (full binary trees), defined by $R_0 = L_0 = B_0 = c$ and

$R_{n+1} = c \circ R_n, \quad L_{n+1} = L_n \circ c, \quad B_{n+1} = B_n \circ B_n.$

It is not difficult to verify that $\tau[\Delta](R_n) = n$,

$\tau[\Delta](L_n) = n + \Delta\,\frac{n(n-1)}{2}, \quad \tau[\Delta](B_n) = 2^n - 1 + \Delta(2^{n-1}(n - 2) + 1),$

thus by our key observation we obtain $\mathrm{dh}_R(R_n) \le 0$,

$\mathrm{dh}_R(L_n) \le \frac{n(n-1)}{2}, \quad \mathrm{dh}_R(B_n) \le 2^{n-1}(n - 2) + 1.$

This upper bound is obviously optimal for $R_n$. That this is also true for $L_n$ and
$B_n$ could easily be seen by considering innermost reductions. In this example,
however, we can again more generally prove $\inf_{\Delta > 0} \tau[\Delta](t)/\Delta = \mathrm{dh}_R(t)$ for all
ground terms $t$, see appendix 5.2.
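The values for combs and full binary trees can be cross-checked against an exhaustive search for longest derivations. The sketch below is an illustration under assumptions made here: a ground term is either the string "c" or a pair (left, right) standing for left o right, and the function names are hypothetical. The closed form used for the left comb, n + Delta n(n-1)/2, is the one obtained by unfolding the definition of the interpretation.

```python
from fractions import Fraction
from functools import lru_cache

C = "c"  # the constant; a pair (s, t) encodes the term s o t


def tau_cd(delta: Fraction, term) -> Fraction:
    """tau[delta](s o t) = (1+delta) tau[delta/(1+delta)](s) + tau[delta](t) + 1."""
    if term == C:
        return Fraction(0)
    left, right = term
    return (1 + delta) * tau_cd(delta / (1 + delta), left) + tau_cd(delta, right) + 1


def successors(term):
    """One-step reducts under (x o y) o z -> x o (y o z), at any position."""
    if term == C:
        return
    left, right = term
    if left != C:
        x, y = left
        yield (x, (y, right))
    for new_left in successors(left):
        yield (new_left, right)
    for new_right in successors(right):
        yield (left, new_right)


@lru_cache(maxsize=None)
def dh(term) -> int:
    """Derivation height by exhaustive search (feasible for small terms only)."""
    return max((1 + dh(s) for s in successors(term)), default=0)


def right_comb(n):
    term = C
    for _ in range(n):
        term = (C, term)
    return term


def left_comb(n):
    term = C
    for _ in range(n):
        term = (term, C)
    return term


def full_tree(n):
    term = C
    for _ in range(n):
        term = (term, term)
    return term


if __name__ == "__main__":
    delta = Fraction(1, 4)
    for n in range(4):
        print(n, tau_cd(delta, left_comb(n)), dh(left_comb(n)), dh(full_tree(n)))
```

The search is exponential in general, so it is meant only for terms with a handful of constants.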



3.2 Introductory Example, Cont’d 

Up to now we have seen examples with polynomially bounded derivational com- 
plexity only. A generalization of the first example will lead to (tight) exponential 
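bounds. Consider a family of one-rule systems $R_k$ for $k > 0$ with the rule

$a(b(x)) \to b^k(a(x)).$

A straightforward strictly monotone interpretation for $R_k$ is $a_\tau(n) = (k + 1)n$,
$b_\tau(n) = 1 + n$, $c_\tau = 0$. This suggests the context-dependent interpretation

$a_\tau[\Delta](z) = (k + \Delta)z, \quad b_\tau[\Delta](z) = 1 + z, \quad c_\tau[\Delta] = 0.$

Again, solving the $\Delta$-monotonicity requirements gives $(k + \Delta)(z + a_\tau^1(\Delta)) - (k + \Delta)z = (k + \Delta)a_\tau^1(\Delta) \ge \Delta$
and $(1 + z + b_\tau^1(\Delta)) - (1 + z) \ge \Delta$, thus we choose

$a_\tau^1(\Delta) = \frac{\Delta}{k + \Delta}, \quad b_\tau^1(\Delta) = \Delta,$

and obtain the context-dependent interpretation $\tau[\Delta](c) = 0$,

$\tau[\Delta](a(t)) = (k + \Delta) \cdot \tau[\tfrac{\Delta}{k+\Delta}](t), \quad \tau[\Delta](b(t)) = 1 + \tau[\Delta](t).$

Verification of $\tau[\Delta](a(b(t))) - \tau[\Delta](b^k(a(t))) = \Delta$ is simple. For terms $a^n b^m c$ we
obtain, for $k > 1$,

$\tau[\Delta](a^n b^m c) = \left(k^n + \frac{k^n - 1}{k - 1} \cdot \Delta\right) m,$

thus

$\inf_{\Delta > 0} \frac{\tau[\Delta](a^n b^m c)}{\Delta} = \frac{k^n - 1}{k - 1} \cdot m = \mathrm{dh}_{R_k}(a^n b^m c).$

For a proof of $\inf_{\Delta > 0} \tau[\Delta](t)/\Delta = \mathrm{dh}_{R_k}(t)$ for all terms $t$ see appendix 5.1.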
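The exponential case can be checked in the same spirit. The following sketch is illustrative only: terms are again encoded as strings such as "aabbc" (an assumption of this example), the rule a(b(x)) -> b^k(a(x)) becomes the string rewrite "ab" -> "b"*k + "a", and the names `tau_k` and `dh_k` are hypothetical. It compares the interpretation, with argument parameter Delta/(k + Delta) below a, against a brute-force derivation height.

```python
from fractions import Fraction
from functools import lru_cache


def tau_k(k: int, delta, term: str) -> Fraction:
    """Context-dependent interpretation tau[delta] for the system R_k."""
    delta = Fraction(delta)
    if term == "c":
        return Fraction(0)
    head, rest = term[0], term[1:]
    if head == "a":
        return (k + delta) * tau_k(k, delta / (k + delta), rest)
    if head == "b":
        return 1 + tau_k(k, delta, rest)
    raise ValueError(f"unexpected symbol {head!r}")


@lru_cache(maxsize=None)
def dh_k(k: int, term: str) -> int:
    """Derivation height of a term in R_k by exhaustive search."""
    best = 0
    for i in range(len(term) - 1):
        if term[i:i + 2] == "ab":
            reduct = term[:i] + "b" * k + "a" + term[i + 2:]
            best = max(best, 1 + dh_k(k, reduct))
    return best


if __name__ == "__main__":
    k, n, m = 2, 3, 2
    term = "a" * n + "b" * m + "c"
    print(dh_k(k, term), (k ** n - 1) // (k - 1) * m)
    print(tau_k(k, Fraction(1, 2), term))
```

Since normal forms grow like k^n, the search is practical only for small n.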
3.3 Where Simplification Orderings Fail

Here we show that our approach is general enough to handle rewrite systems that
are not simply terminating. The example below illustrates that this is possible
even without involving context dependency, due to weak monotonicity and
$\Delta$-monotonicity for fixed $\Delta$. Consider

$a(a(x)) \to a(b(a(x)))$

over signature $\{a, b, c\}$ from the introductory example. First, define (non
context-dependent) interpretation functions $a_\tau, b_\tau : \mathbb{R}_0^+ \to \mathbb{R}_0^+$ by

$a_\tau(z) = n + 1/2$ if $n - 1 < z \le n$,
$b_\tau(z) = n$ if $n - 1/2 < z \le n + 1/2$,

where $z \in \mathbb{R}_0^+$, $n \in \mathbb{N}$, and let $c_\tau = 0$. Both functions are weakly monotone, and
'1-monotonicity' holds:

$a_\tau(z + 1) - a_\tau(z) \ge 1, \quad b_\tau(z + 1) - b_\tau(z) \ge 1.$

Note that over natural numbers, the requirement $b_\tau(n + 1) - b_\tau(n) \ge 1$ implies
$b_\tau(n) \ge n$. This is not the case for the domain of real numbers. For instance,
$b_\tau(1/2) = 0$, thus $\tau(b(a(c))) < \tau(a(c))$, violating any kind of subterm property.

Lemma 4. $\tau(a(a(t))) - \tau(a(b(a(t)))) \ge 1$ for $t \in T_\Sigma$.

Proof. We consider two cases as $\tau(T_\Sigma) = \{n, n + 1/2 \mid n \in \mathbb{N}\}$. For $z = n + 1/2$ we
have $a_\tau(a_\tau(z)) = a_\tau(n + 3/2) = n + 5/2$ and $a_\tau(b_\tau(a_\tau(z))) = a_\tau(b_\tau(n + 3/2)) =
a_\tau(n + 1) = n + 3/2$, for $z = n$ we have $a_\tau(a_\tau(z)) = a_\tau(n + 1/2) = n + 3/2$ and
$a_\tau(b_\tau(a_\tau(z))) = a_\tau(b_\tau(n + 1/2)) = a_\tau(n) = n + 1/2$. □

Now, similar to the considerations in section 2.1, this can easily be turned into
a context-dependent interpretation in the proper sense (without really depending
on the context, though). Thus the rewrite system is terminating, and $\mathrm{dh}_R(t) \le
\tau(t)$. This bound is optimal as we have

$\mathrm{dh}_R(t) = \lfloor \tau(t) \rfloor.$
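The step functions of this example are easy to mistype, so a small executable check may help. The sketch below is an illustration under assumptions made here: terms are encoded as strings over a and b followed by the constant c, the rule becomes the string rewrite "aa" -> "aba", and the helper names are hypothetical. It reads the interval conditions as n - 1 < z <= n for a and n - 1/2 < z <= n + 1/2 for b, which is the reading consistent with the values used in the proof of Lemma 4.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import ceil, floor

HALF = Fraction(1, 2)


def a_tau(z: Fraction) -> Fraction:
    """a_tau(z) = n + 1/2 for the natural number n with n - 1 < z <= n."""
    return ceil(z) + HALF


def b_tau(z: Fraction) -> Fraction:
    """b_tau(z) = n for the natural number n with n - 1/2 < z <= n + 1/2."""
    return Fraction(ceil(z - HALF))


def tau(term: str) -> Fraction:
    """Interpretation of a string term such as 'aabac' (c_tau = 0)."""
    value = Fraction(0)
    for symbol in reversed(term[:-1]):  # skip the trailing constant c
        value = a_tau(value) if symbol == "a" else b_tau(value)
    return value


@lru_cache(maxsize=None)
def dh(term: str) -> int:
    """Derivation height under a(a(x)) -> a(b(a(x))) by exhaustive search."""
    best = 0
    for i in range(len(term) - 1):
        if term[i:i + 2] == "aa":
            best = max(best, 1 + dh(term[:i] + "aba" + term[i + 2:]))
    return best


def check(max_len: int) -> bool:
    """Compare dh(t) with floor(tau(t)) for all terms with up to max_len unary symbols."""
    for length in range(max_len + 1):
        for word in product("ab", repeat=length):
            term = "".join(word) + "c"
            if dh(term) != floor(tau(term)):
                return False
    return True


if __name__ == "__main__":
    print(tau("ac"), tau("bac"))  # the subterm property fails here
    print(check(6))
```

In this encoding the floor of the interpretation counts the adjacent pairs "aa" in the string, and each rewrite step removes exactly one such pair.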



4 Conclusion 

By the approach presented in this paper one can sometimes obtain tight upper 
bounds on the derivational complexity of rewrite systems. We think that there 
is a certain potential to further extend this idea. In particular, we believe that 
semi-automatic assistance in finding complexity bounds might be possible by a 
two step procedure as follows. First, find the interpretation, for instance using 
variants of methods presented for traditional interpretations. We have seen that 
in a number of simple examples, variants of automatically generated polyno- 
mial interpretations are appropriate; heuristics for finding termination proofs 
by interpretations or their implementation are addressed in |2l28l25l26l2Yll0lfH,
among others. Second, extract complexity bounds from the interpretation. For 
recursively defined families of terms, or for all terms, a computer algebra system 
might be helpful here. 

We finally want to point out that a priori knowledge of the derivation
length function is only used for proving that the bounds obtained for the few
simple examples are optimal. Our approach allows one to obtain better upper
bounds than previously possible without that knowledge; therefore these proofs
are relegated to the appendix.



Acknowledgements. I am grateful to the referees for their helpful remarks. 
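5 Appendix

5.1 Introductory Example

In this section let $R$ abbreviate $R_k$, and let dh and $\downarrow$ denote $\mathrm{dh}_{R_k}$ and $\downarrow_{R_k}$
respectively (in case term $t$ has a unique normal form with respect to $R$, as
always in this example, it is denoted by $t{\downarrow}$). We prove the following claim
for $\Delta \in \mathbb{R}^+$ and $t \in T_\Sigma$ by induction on $t$, where $|t|_f$ denotes the number of
occurrences of symbol $f$ in term $t$:

$\tau[\Delta](t) = |t{\downarrow}|_b + \Delta\,\mathrm{dh}(t).$

Proof. For $t = c$ we get $\tau[\Delta](c) = 0 = |c{\downarrow}|_b + \Delta\,\mathrm{dh}(c)$ as $|c{\downarrow}|_b = \mathrm{dh}(c) = 0$. For
$t = b(s)$ we know $|b(s){\downarrow}|_b = 1 + |s{\downarrow}|_b$ and $\mathrm{dh}(b(s)) = \mathrm{dh}(s)$. Thus, using the
induction hypothesis,

$\tau[\Delta](b(s)) = 1 + \tau[\Delta](s)$
$= 1 + |s{\downarrow}|_b + \Delta\,\mathrm{dh}(s)$
$= |b(s){\downarrow}|_b + \Delta\,\mathrm{dh}(b(s)).$

For $t = a(s)$ we know $|a(s){\downarrow}|_b = k|s{\downarrow}|_b$ and $\mathrm{dh}(a(s)) = \mathrm{dh}(s) + |s{\downarrow}|_b$ from
$s{\downarrow} \in b^* a^* c$. Hence, using the induction hypothesis,

$\tau[\Delta](a(s)) = (k + \Delta)\,\tau[\tfrac{\Delta}{k+\Delta}](s)$
$= (k + \Delta)\big(|s{\downarrow}|_b + \tfrac{\Delta}{k+\Delta}\,\mathrm{dh}(s)\big)$
$= k|s{\downarrow}|_b + \Delta\,\mathrm{dh}(s) + \Delta|s{\downarrow}|_b$
$= |a(s){\downarrow}|_b + \Delta\,\mathrm{dh}(a(s)).$ □

As a direct consequence,

$\inf_{\Delta > 0} \frac{\tau[\Delta](t)}{\Delta} = \inf_{\Delta > 0} \left(\frac{|t{\downarrow}|_b}{\Delta} + \mathrm{dh}(t)\right) = \mathrm{dh}(t).$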
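5.2 Associativity

Similarly, we prove for $\Delta \in \mathbb{R}^+$ and $t \in T_\Sigma$, where dh abbreviates $\mathrm{dh}_R$,

$\tau[\Delta](t) \le |t|_c - 1 + \Delta\,\mathrm{dh}(t).$

Proof. For $t = c$ we have $\tau[\Delta](c) = 0 = |c|_c - 1 + \Delta\,\mathrm{dh}(c)$ as $|c|_c = 1$ and
$\mathrm{dh}(c) = 0$. For $t = r \circ s$ we know that

$\mathrm{dh}(r \circ s) \ge \mathrm{dh}(r) + \mathrm{dh}(s) + |r|_c - 1$

since $\mathrm{dh}(r \circ s) \ge \mathrm{dh}(r) + \mathrm{dh}(s) + \mathrm{dh}(R_{|r|_c - 1} \circ R_{|s|_c - 1})$ and $\mathrm{dh}(R_n \circ t) \ge n + \mathrm{dh}(t)$
and $\mathrm{dh}(R_n) = 0$. (Note that $R_{|t|_c - 1}$ is the unique normal form of a term $t$.)
Thus, applying the induction hypothesis twice,

$\tau[\Delta](r \circ s) = (1 + \Delta)\,\tau[\tfrac{\Delta}{1+\Delta}](r) + \tau[\Delta](s) + 1$
$\le (1 + \Delta)\big(|r|_c - 1 + \tfrac{\Delta}{1+\Delta}\,\mathrm{dh}(r)\big) + |s|_c - 1 + \Delta\,\mathrm{dh}(s) + 1$
$= |r|_c + |s|_c - 1 + \Delta(\mathrm{dh}(r) + \mathrm{dh}(s) + |r|_c - 1)$
$\le |r \circ s|_c - 1 + \Delta\,\mathrm{dh}(r \circ s).$ □

Again, as a consequence we obtain

$\inf_{\Delta > 0} \frac{\tau[\Delta](t)}{\Delta} \le \inf_{\Delta > 0} \left(\frac{|t|_c - 1}{\Delta} + \mathrm{dh}(t)\right) = \mathrm{dh}(t).$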
Abstract. A rewrite system is called uniformly normalising if all its
steps are perpetual, i.e. are such that if $s \to t$ and $s$ has an infinite
reduction, then $t$ has one too. For such systems termination (SN) is equivalent
to normalisation (WN). A well-known fact is uniform normalisation of
orthogonal non-erasing term rewrite systems, e.g. the $\lambda I$-calculus. In the
present paper both restrictions are analysed. Orthogonality is seen to
pertain to the linear part and non-erasingness to the non-linear part of
rewrite steps. Based on this analysis, a modular proof method for uniform
normalisation is presented which allows one to go beyond orthogonality.
The method is shown applicable to biclosed first- and second-order term
rewrite systems as well as to a $\lambda$-calculus with explicit substitutions.



1 Introduction 

Two classical results in the study of uniform normalisation are: 

— the $\lambda I$-calculus is uniformly normalising jjj p. 20, 7 XXV], and

— non-erasing steps are perpetual in orthogonal TRSs m Thm. II.5.9.6]. 

In previous work we have put these results and many variations on them in a
unifying framework m- At the heart of that paper is the result (Thm. 3.16)
that a term $s$ not in normal form contains a redex which is external for any
reduction from $s$.^1 Since external redexes need not exist in rewrite systems having
critical pairs, the result does not apply to these. The method presented here is
based instead on the existence of redexes which are external for all reductions
which are permutation equivalent to a given reduction. Since this so-called
standardisation theorem holds for all left-linear rewrite systems, with or without
critical pairs, the resulting framework is more general. It is applied to obtain

^1 According to im p. 404], a redex at position $p$ is external to a reduction if in the
reduction no redex is contracted above $p$ to which the redex did not contribute.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 1 22- Tn?1 2001. 
uniform normalisation results for abstract rewrite systems (ARSs), first-order
term rewrite systems (TRSs) and second-order term rewrite systems (I 2 RS) in
Sect. 2, 3 and 4 respectively. In each section, the proof method is presented
for the orthogonal case first, deriving traditional results. We then vary on it,
relaxing the orthogonality restriction. This leads to new uniform normalisation
results for biclosed rewrite systems (e.g. Cor. 1300 and0. In Sect. 0 uniform
normalisation for $\lambda x^-$, a prototypical $\lambda$-calculus with explicit substitutions, is
shown to hold, extending earlier work of |E| who only shows it for the explicit
substitution part x of the calculus. The proof boils down to an analysis of the
(only) critical pair of $\lambda x^-$ and uses a particularly simple proof of preservation
of strong normalisation for $\lambda x^-$, also based on the standardisation theorem.

2 Abstract Rewriting 

Although trivial, the results in this section and their proofs form the heart of 
the following sections. Moreover, they are applicable to various concrete (linear) 
rewrite systems, for instance to interaction nets m- The reader is assumed to 
be familiar with abstract rewrite systems (ARSs, [I bf Chap. 1] or P] Chap. 2]). 
Definition 1. Let $a$ be an object of an abstract rewrite system. $a$ is terminating
(strongly normalising, SN) if no infinite reductions are possible from it. We use
$\infty$ to denote the complement of SN. $a$ is normalising (weakly normalising, WN)
if some reduction to normal form is possible from it.

Definition 2. A rewrite step $s \to t$ is critical if $s \in \infty$ and $t \in \mathrm{SN}$, and
perpetual otherwise. A rewrite system is uniformly normalising if there are no
critical steps.

First, note that a rewrite system is uniformly normalising iff $\mathrm{WN} \subseteq \mathrm{SN}$ holds.
Moreover, uniform normalisation holds for deterministic rewrite systems.

Definition 3. A fork in a rewrite system is a pair of steps $t_1 \leftarrow s \to t_2$. It is
called trivial if $t_1 = t_2$. A rewrite system is deterministic if all forks are trivial,
and non-deterministic otherwise.

To analyse uniform normalisation for non-deterministic rewrite systems it thus
seems worthwhile to study their non-trivial forks.

Definition 4. A rewrite system is linear orthogonal if every fork $t_1 \leftarrow s \to t_2$
is either trivial or square, that is, $t_1 \to s' \leftarrow t_2$ for some $s'$ jZl Exc. 2.33].

We will show the fundamental theorem of perpetuality: 

Theorem 1 (FTP). Steps are perpetual in linear orthogonal rewrite systems. 
Corollary 1. Linear orthogonal rewrite systems are uniformly normalising. 

In the next section we will show (Lem. 1) that the abstract rewrite system
associated to a term rewrite system which is linear and orthogonal, is linear 
orthogonal. Linear orthogonality is a weakening of the diamond property 0 
Def. 2.7.8], and a strengthening of subcommutativity ^3 Def. 1.1. (v)j and of 
the balanced weak Church-Rosser property EH) Def. 3.1], whence: 
Proof. (of Thm. 1) Suppose $s \in \infty$ and $s \to t$. We need to show $t \in \infty$. By
the first assumption, there exists an infinite reduction $S : s_0 \to s_1 \to s_2 \to \cdots$
with $s_0 = s$. One can build an infinite reduction $T$ from $t$ as follows: let $t_0 = t$
be the first object of $T$. By orthogonality we can find for every non-trivial fork
$s_{i+1} \leftarrow s_i \to t_i$ a next object $t_{i+1}$ of $T$ such that $s_{i+1} \to t_{i+1} \leftarrow t_i$. Consider
a maximal reduction $T$ thus constructed. If $T$ is infinite we are done. If $T$ is
finite, it has a final object, say $t_n$, and a fork $s_{n+1} \leftarrow s_n \to t_n$ exists which is
trivial, i.e. $s_{n+1} = t_n$. Hence, $T$ and the infinite reduction $S$ from $s_{n+1}$ on can
be concatenated. □

FTP can be brought beyond linear orthogonality. Let $\to^=$ and $\twoheadrightarrow$ denote the
reflexive and reflexive-transitive closure of $\to$ respectively.

Definition 5. A fork $t_1 \leftarrow s \to t_2$ is closed if $t_1 \twoheadrightarrow t_2$. A rewrite system is
linear biclosed if all forks are either closed or square.^2

By replacing the appeal to triviality by an appeal to closedness in the proof
of FTP, i.e. by replacing $s_{n+1} = t_n$ by $s_{n+1} \twoheadleftarrow t_n$, we get:

Corollary 2. Linear biclosed rewrite systems are uniformly normalising. 
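For a finite abstract rewrite system the fork conditions of Definitions 4 and 5 can be tested directly. The sketch below is an illustration only: the ARS is assumed to be given as a dictionary mapping each object to the list of its one-step reducts, and all function names are hypothetical. Forks are enumerated as ordered pairs, so closedness is required in both directions, in line with the remark on symmetry.

```python
def reachable(ars, start):
    """Objects reachable from start in zero or more steps."""
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for nxt in ars.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def forks(ars):
    """All ordered forks t1 <- s -> t2."""
    for s, reducts in ars.items():
        for t1 in reducts:
            for t2 in reducts:
                yield s, t1, t2


def is_square(ars, t1, t2):
    """Is there some s' with t1 -> s' <- t2?"""
    return bool(set(ars.get(t1, ())) & set(ars.get(t2, ())))


def is_closed(ars, t1, t2):
    """Does t1 reduce to t2 in zero or more steps?"""
    return t2 in reachable(ars, t1)


def is_linear_orthogonal(ars):
    return all(t1 == t2 or is_square(ars, t1, t2) for _, t1, t2 in forks(ars))


def is_linear_biclosed(ars):
    return all(is_closed(ars, t1, t2) or is_square(ars, t1, t2)
               for _, t1, t2 in forks(ars))


if __name__ == "__main__":
    diamond = {"s": ["t1", "t2"], "t1": ["u"], "t2": ["u"], "u": []}
    swap = {"s": ["t1", "t2"], "t1": ["t2"], "t2": ["t1"]}
    print(is_linear_orthogonal(diamond), is_linear_biclosed(diamond))
    print(is_linear_orthogonal(swap), is_linear_biclosed(swap))
```

The hypothetical system `swap` is linear biclosed without being linear orthogonal: its only non-trivial fork is closed in both directions but not square.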

3 First-Order Term Rewriting 

In this section first the uniform normalisation results of Section 2 are instantiated
to linear term rewriting. Next, the fundamental theorem of perpetuality for
first-order term rewrite systems is established:

Theorem 2 ($F_1$TP). Non-erasing steps are perpetual in orthogonal TRSs.
Corollary 3. Non-erasing orthogonal TRSs are uniformly normalising.

The chief purpose of this section is to illustrate our proof method based on
standardisation. Except for the results on biclosed systems, the results obtained
are not novel (cf. Lem. 8.11.3.2] and |3 Sect. 3.3]). The reader is assumed
to be familiar with first-order term rewrite systems (TRSs) as can be found in
e.g. uni or p. We summarise some aberrations and additional concepts:

Definition 6.
— A term is linear if any variable occurs at most once in it. Let
$\varrho : l \to r$ be a TRS rule. It is left-linear (right-linear) if $l$ ($r$) is linear. It
is linear if $\mathrm{Var}(l) = \mathrm{Var}(r)$ and both sides are linear. A TRS is (left-, right-)
linear if all its rules are.

— Let $\varrho : l \to r$ be a rule. A variable $x \in \mathrm{Var}(l)$ is erased by $\varrho$ if it does not
occur in $r$. The rule $\varrho$ is erasing if it erases some variable. A rewrite step is
erasing if the applied rule is. A TRS is erasing if some step is.

— Let $\varrho : l \to r$ and $\vartheta : g \to d$ be rules which have been renamed apart. Let $p$
be a non-variable position in $l$. $\varrho$ is said to overlap $\vartheta$ at $p$ if a unifier $\sigma$ of
$l|_p$ and $g$ does exist. If $\sigma$ is a most general such unifier, then both $(l[d]_p^\sigma, r^\sigma)$
and $(r^\sigma, l[d]_p^\sigma)$ are critical pairs at $p$ between $\varrho$ and $\vartheta$.^3

^2 Beware of the symmetry: if the fork is not square, then both $t_1 \twoheadrightarrow t_2$ and $t_2 \twoheadrightarrow t_1$.

^3 Beware of the symmetry (see the next item and cf. Footnote 2).
— If for all such critical pairs $(t_1, t_2)$ of a left-linear TRS $\mathcal{R}$ it holds that:

$t_1 \twoheadrightarrow s \mathrel{{}^{=}{\leftarrow}} t_2$, then $\mathcal{R}$ is strongly closed }1(A p. 812]
$t_1 \twoheadrightarrow t_2$, then $\mathcal{R}$ is biclosed p. TO]
$t_1 = t_2$, then $\mathcal{R}$ is weakly orthogonal
$t_1 = t_2$ and $p = \epsilon$, then $\mathcal{R}$ is almost orthogonal
$t_1 = t_2$, $p = \epsilon$ and $\varrho = \vartheta$, then $\mathcal{R}$ is orthogonal

Some remarks are in order. First, our critical pairs for a TRS are the critical
pairs $(s, t)$ of P Def. 6.2.1] extended with their opposites ($(t, s)$) and the trivial
critical pairs between a rule with itself at the head ($(r, r)$ for every rule $l \to r$).
Next, linearity in our sense implies linearity in the sense of P Def. 6.3.1], but
not vice versa. Linearity of a step $s = C[l^\sigma] \to C[r^\sigma] = t$ as defined here captures
the idea that every symbol in the context-part $C$ or the substitution-part $\sigma$ in
$s$ has a unique descendant in $t$, whereas linearity in the sense of P Def. 6.3.1]
only guarantees that there is at most one descendant in $t$. Remark:

orth. $\Rightarrow$ almost orth. $\Rightarrow$ weakly orth. $\Rightarrow$ biclosed $\Rightarrow$ strongly closed
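The syntactic notions of Definition 6 (linear terms, left- and right-linear rules, linear rules in the strict sense used here, and erasing rules) are simple enough to state as code. The sketch below is an illustration under assumptions made for this example: a variable is a plain string, a function application is a tuple whose first component is the function symbol, a constant is a one-element tuple, and a rule is a pair (lhs, rhs). None of these names come from the paper.

```python
def variables(term):
    """Variable occurrences of a term, left to right, with repetitions."""
    if isinstance(term, str):
        return [term]
    return [v for argument in term[1:] for v in variables(argument)]


def is_linear_term(term):
    """A term is linear if no variable occurs more than once in it."""
    occurrences = variables(term)
    return len(occurrences) == len(set(occurrences))


def is_left_linear(rule):
    return is_linear_term(rule[0])


def is_right_linear(rule):
    return is_linear_term(rule[1])


def is_linear(rule):
    """Linear in the sense used here: Var(l) = Var(r) and both sides linear."""
    lhs, rhs = rule
    return (set(variables(lhs)) == set(variables(rhs))
            and is_linear_term(lhs) and is_linear_term(rhs))


def is_erasing(rule):
    """A rule is erasing if some variable of its left-hand side is missing on the right."""
    lhs, rhs = rule
    return bool(set(variables(lhs)) - set(variables(rhs)))


if __name__ == "__main__":
    erase = (("e", "x"), ("b",))                           # e(x) -> b
    swap = (("a", ("b", "x")), ("b", ("a", "x")))          # a(b(x)) -> b(a(x))
    duplicate = (("f", "x"), ("g", "x", "x"))              # f(x) -> g(x, x), hypothetical
    for rule in (erase, swap, duplicate):
        print(is_left_linear(rule), is_right_linear(rule), is_linear(rule), is_erasing(rule))
```

Under this reading an erasing rule such as e(x) -> b is left- and right-linear but not linear, because the variable sets of the two sides differ.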



3.1 Linear Term Rewriting 

In this subsection the results of Section 2 for abstract rewriting are instantiated
to linear term rewriting. First, remark that linear strongly closed TRSs are
confluent (combine Lem. 6.3.2, 6.3.3 and 2.7.4 of P). Therefore, a linear TRS
satisfying any of the above mentioned critical pair criteria is confluent.

Lemma 1. If $\mathcal{R}$ is a linear orthogonal TRS, $\to_\mathcal{R}$ is a linear orthogonal ARS.

Proof. The proof is based on the standard critical pair analysis of a fork
$t_1 \leftarrow_\mathcal{R} s \to_\mathcal{R} t_2$ as in P Sect. 6.2]. Actually, it is directly obtained from the proof of P
Lem. 6.3.3], by noting that:

Case 1 (parallel) establishes that the fork is square (joinable into a diamond).
Case 2.1 (nested) also yields that the fork is square,^4 and
Case 2.2 (overlap) can occur only if the steps in the fork arise by applying the
same rule at the same position, by orthogonality, so the fork is trivial. □

From Lem. 1 and Cor. 1 we obtain a special case of Corollary 3:
Corollary 4. Linear orthogonal TRSs are uniformly normalising.

Lemma 2. If $\mathcal{R}$ is a linear biclosed TRS, $\to_\mathcal{R}$ is a linear biclosed ARS.

Proof. The analysis in the proof of Lem. 1 needs to be adapted as follows:

Case 2.2, the instance of a critical pair, is closed by biclosedness of critical
pairs and the fact that rewriting is closed under substitution. □

Corollary 5. Linear biclosed TRSs are uniformly normalising.

^4 Note that the case $x \notin \mathrm{Var}(r_1)$ cannot happen, due to our notion of linearity.
3.2 Non-linear Term Rewriting

In this subsection the results of the previous subsection are adapted to non-linear
TRSs, leading to a proof of $F_1$TP (Thm. 2). The adaptation is non-trivial, since
uniform normalisation may fail for orthogonal non-linear TRSs.

Example 1. The term $e(a)$ in the TRS $\{a \to a,\ e(x) \to b\}$ witnesses that
orthogonal TRSs need not be uniformly normalising.
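The failure of uniform normalisation in Example 1 can be made concrete on the finite rewrite graph of the term e(a). The sketch below is an illustration under assumptions made here: the abstract rewrite system is given as a dictionary from objects to sets of one-step reducts, objects are written as strings, and the helper names are hypothetical. For a finite system an object admits an infinite reduction exactly when a cycle is reachable from it, which is what `is_sn` tests.

```python
def reachable(ars, start):
    """Objects reachable from start in zero or more steps."""
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for nxt in ars.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_wn(ars, obj):
    """Weakly normalising: some normal form is reachable."""
    return any(not ars.get(x) for x in reachable(ars, obj))


def is_sn(ars, obj):
    """Strongly normalising: no cycle is reachable (finite systems only)."""
    for x in reachable(ars, obj):
        if any(x in reachable(ars, y) for y in ars.get(x, ())):
            return False
    return True


def critical_steps(ars):
    """Steps s -> t with s not SN and t SN."""
    return [(s, t) for s, reducts in ars.items() for t in reducts
            if not is_sn(ars, s) and is_sn(ars, t)]


# Rewrite graph of Example 1: a -> a inside e(a), and e(x) -> b at the root.
EXAMPLE_1 = {"e(a)": {"e(a)", "b"}, "a": {"a"}, "b": set()}

if __name__ == "__main__":
    print(is_wn(EXAMPLE_1, "e(a)"), is_sn(EXAMPLE_1, "e(a)"))
    print(critical_steps(EXAMPLE_1))
```

Here e(a) is weakly but not strongly normalising, and the erasing step from e(a) to b is the single critical step.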

Non-linearity of a TRS may be caused by non-left-linearity. Although
non-left-linearity in itself is not fatal for uniform normalisation of TRSs (see 0 Chap. 3],
e.g. Cor. 3.2.9), it will be in case of second-order rewriting (cf. Ex. EJ and our
method cannot deal with it. Hence: We assume TRSs to be left-linear.

Under this assumption, non-linearity may only be caused by some symbol having
zero or multiple descendants after a step. The problem in Ex. 1 is seen to arise
from the fork $e(a) \leftarrow e(a) \to b$ which is not balancedly joinable: it is neither
trivial ($e(a) \ne b$) nor square ($\nexists s'\ e(a) \to s' \leftarrow b$). Erasingness is the only problem.
To prove $F_1$TP, we will make use of the apparent asymmetry in the non-linearity
of term rewrite steps: an occurrence of a left-hand side of a rule $l \to r$ splits the
surrounding into two parts (see Fig. 1):

— the context-part above or parallel to ^ Def. 3.1.3] $l$, and

— the argument-part, below $l$.

Fig. 1. Split (panels: string, term)

Observe that term rewrite steps in the context-part might replicate the
occurrence of the left-hand side $l$, whereas steps in the argument-part cannot do so.
To deal with such replicating steps in the context-part, we will actually prove a
strengthening of $F_1$TP for parallel steps instead of ordinary steps.

Definition 7. Let $\varrho : l \to r$ be a TRS rule. $s$ parallel rewrites to $t$ using $\varrho$,
$s \mathrel{-\!\!\parallel\!\!\to}_\varrho t$ fW\ p. 814],^5 if it holds that $s = C[l^{\sigma_1}, \ldots, l^{\sigma_k}]$ and $t = C[r^{\sigma_1}, \ldots, r^{\sigma_k}]$,
for some $k \ge 0$. The step is erasing if the rule is. The context (argument)-part
of the step is the part above or parallel to all (below some) occurrences of $l$.

^5 Actually our notion is a restriction of his, since we allow only one rule.
To reduce $F_1$TP to FTP it suffices to reduce to the case where the infinite
reduction does not take place (entirely) in the context-part, since then the steps
either have overlap or are in the, linear, argument-part. To that end, we want
to transform the infinite reduction into an infinite reduction where the steps in
the context-part precede the steps in the argument-part.

Definition 8. A reduction is standard (see Fig. 2) if for any step $C[l^\sigma]_p \to
C[r^\sigma]_p$ in the reduction, $p$ is in the pattern of the first step after that step which
is above $p$. That is, if $D[g^\tau]_q$ displays the occurrence of the first redex with $p = qo$,
we have that $o$ is a non-variable position in $g$.

Theorem 3 (STD). Any reduction in a TRS can be transformed into a
standard one. The transformation preserves infiniteness.

Proof. The first part of the theorem was shown to hold for orthogonal TRSs
in im Thm. 3.19] and extended to left-linear TRSs possibly having critical pairs
in 0. That standardisation preserves infiniteness follows from the fact that at
some moment along an infinite reduction $S : s_0 \to s_1 \to \cdots$ a redex at minimal
position $p$ w.r.t. the prefix order $\le$ ^ Def. 3.1.3] must be contracted. Say this
happens the first time in step $s_i \to s_{i+1}$. Permute all steps parallel to $p$ in $S$
after this step resulting in $S_0; S_1$, where $S_0$ contains only steps below $p$ and ends
with a step at position $p$, and $S_1$ is infinite. Standardise $S_0$ into $T_0$, note that
it is non-empty and that concatenating $T_0$ with any standardisation of $S_1$ will
yield a standard reduction by the choice of $p$. Repeat the process on $S_1$. □

Proof. (of Thm. 2) Suppose $s \in \infty$ and $s \mathrel{-\!\!\parallel\!\!\to}_\varrho t$ is non-erasing, contracting $k$
redexes w.r.t. rule $\varrho : l \to r$ in parallel. We need to show $t \in \infty$. If $k = 0$, then
$t = s \in \infty$. Otherwise, there exists by the first assumption an infinite reduction
$S : s_0 \to_{q_0} s_1 \to_{q_1} s_2 \to \cdots$ with $s_0 = s$ and $s_i \to s_{i+1}$ contracting a redex
at position $q_i$ w.r.t. rule $\vartheta_i : g_i \to d_i$. By STD $S$ may be assumed standard.
Consider the relative positions of the redexes in the fork $s_1 \leftarrow_{q_0} s \mathrel{-\!\!\parallel\!\!\to}_\varrho t$.

(context) If $q_0$ occurs entirely in the context-part of the parallel step, then
by the Parallel Moves lemma ^ Lem. 6.4.4] the fork is joinable into
$s_1 \mathrel{-\!\!\parallel\!\!\to}_\varrho t_1 \leftarrow_{q_0} t_0$. Since $t_0 \to t_1$, $s_1 \in \infty$, and $s_1 \mathrel{-\!\!\parallel\!\!\to}_\varrho t_1$ is non-erasing, repeating
the process will yield an infinite reduction from $t_0 = t$ as desired.

Fig. 2. Standard (labels: path to p untouched by reduction; pattern of g at position q overlaps position p)

(non-context) Otherwise $q_0$ must be below one or overlap at least one
contracted left-hand side $l$, say the one at position $p$. Hence, $s \mathrel{-\!\!\parallel\!\!\to}_\varrho t$ can be
decomposed as $s \to_p s' \mathrel{-\!\!\parallel\!\!\to}_\varrho t$. We claim $s' \in \infty$. The proof is as for
FTP, employing standardness to exclude replication of the pivotal $l$-redex.
Construct a maximal reduction $T$ as follows. Let $t_0 = s'$ be the first object
of $T$. If $q_0$ overlaps the $l$ at position $p$, then $T$ is empty. Otherwise, $q_0$ must
be below that $l$ and we set $o_0 = q_0$.

— Suppose the fork $s_{i+1} \leftarrow_{q_i} s_i \to_p t_i$ is such that the contracted redexes
do not have overlap. As an invariant we will use that $o_i$ records the
outermost position below $l$ (at $p$) and above $q_0$ where a redex was contracted
in the reduction $S$ up to step $i$, hence $p \le o_{i+1} \le o_i \le q_0$. Then $q_i < p$ is
not possible, since by the non-overlap assumption $q_i$ would be entirely
above $p$, hence above $o_i$ as well, violating standardness of $S$. Hence, $q_i$
is parallel to or below $l$ (at $p$). By another appeal to the Parallel Moves
lemma the fork can be joined via $s_{i+1} \to_p t_{i+1} \mathrel{\leftarrow\!\!\parallel\!\!-}_{\vartheta_i} t_i$, where $k > 0$ by
non-erasingness of $s_i \to t_i$ (†). The invariant is maintained by setting
$o_{i+1}$ to $q_i$ if $q_i < o_i$, and to $o_i$ otherwise.

If $T$ is infinite we are done. If $T$ is finite, it has a final object, say $t_n$, and
a fork $s_{n+1} \leftarrow_{q_n, \vartheta_n} s_n \to_{p, \varrho} t_n$ such that the redexes have overlap (‡). By
the orthogonality assumption we must have $q_n = p$ and $\vartheta_n = \varrho$, hence
$s_{n+1} = t_n$. By concatenating $T$ and the infinite reduction $S$ from $s_{n+1}$, the
claim ($s' \in \infty$) is then proven. From the claim, we may repeat the process
with an infinite standard reduction from $s'$ and $s' \mathrel{-\!\!\parallel\!\!\to}_\varrho t$.

Observe that the (context)-case is the only case producing a rewrite step from
$t$, but it must eventually always apply since the other case decreases $k$ by 1. □

By replacing the appeal to orthogonality by an appeal to biclosedness in the
proof of F1TP, i.e. by replacing s_{n+1} = t_n by s_{n+1} ↞ t_n, we get:

Theorem 4. Non-erasing steps are perpetual in biclosed TRSs. 

Corollary 6. Non-erasing biclosed TRSs are uniformly normalising. 

Note that we are beyond orthogonality since biclosed TRSs need not be conflu- 
ent. The example is as for strongly closed TRSs nm p. 814], but note that the 
latter need not be uniformly normalising! Next, we show |IS1 Lem. 8.11.3.2].

Definition 9. A step C[l^σ] → C[r^σ] is ∞-erasing, if it erases all ∞-variables,
that is, if x ∈ Var(r) then x^σ ∈ SN.

Theorem 5. Non-∞-erasing rewrite steps are perpetual in biclosed TRSs.

Proof. Replace in the proof of Thm. 4 everywhere non-erasingness by non-∞-
erasingness. The only thing which fails is the statement resulting from (†):

— By another appeal to the Parallel Moves Lemma the fork can be joined via
s_{i+1} →_p t_{i+1} ←^k t_i, where k > 0 by non-∞-erasingness of s_i → t_i.

We split this case into two new ones depending on whether some argument
(instance of variable) to l is ∞ or not.

— In the former case, t_i ∈ ∞ follows directly from non-∞-erasingness.

— In the latter case, s_i → s_{i+1} may take place in an erased argument, and
s_{i+1} →_p t_{i+1} = t_i. But since all arguments to l are SN, this can happen only
finitely often and eventually the first case applies. □

In 0 a uniform normalisation result not requiring left-linearity, but having a 
critical pair condition incomparable to biclosedness was presented. 

4 Second-Order Term Rewriting 

In this section, the fundamental theorem of perpetuality for second-order term
rewrite systems is established, by generalising the method of the previous section.

Theorem 6 (F2TP). Non-erasing steps are perpetual in orthogonal P2RSs.

Corollary 7. Non-erasing orthogonal P2RSs are uniformly normalising.

For ERSs and CRSs these results can be found as IT^ Thm. 60] and 1141 
Cor. II. 5. 9. 4], respectively. The reader is assumed to be familiar with second- 
order term rewrite systems be it in the form of combinatory reduction systems 
(CRSs Pl]), expression reduction systems (ERSs P3|), or higher-order pattern 
rewrite systems (PRSs ^3)- We employ PRSs as defined in H3, but will write 
x.s instead of Xx.s, thereby freeing the A for usage as a function symbol. 

Definition 10. — The order of a rewrite rule is the maximal order of the free
variables in it. The order of a PRS is the maximal order of the rules in it.
PnRS abbreviates nth-order PRS.

— A rule l → r is fully-extended (FE) if for every occurrence Z(t_1, ..., t_n) in
l of a free variable Z, t_1, ..., t_n is the list of variables bound above it.

— A rewrite step s = C[l^σ] → C[r^σ] = t is non-erasing if every symbol from C
and σ in s descends Sect. 3.1.1] to some symbol in t.

The adaptation is non-trivial since uniform normalisation may fail for orthogo-
nal, but third-order or non-left-linear or non-fully-extended systems.



Table 1. Three counterexamples against uniform normalisation of PRSs

third-order:
    (λz.M(z))N → M(N)
    f xy.Z(u.x(u), y) → Z(u.c, Ω)

non-fully-extended:
    M(z)(z := N) → M(N)
    g xy.Z(y) → Z(a)
    e(x, y) → c
    f(a) → f(a)

non-left-linear:
    M(x)(x := N) → M(N)
    g(x.Z(x), x.Z(x)) → Z(a)
    e(x) → c
    f(a) → f(a)


A TRS step is non-erasing in this sense iff it is non-erasing in the sense of Def. 0 

Example 2. (third-order) jT^ Ex. 7.1] Consider the 3rd-order PRS in Tab. 1.
It is the standard PRS-presentation of the λβ-calculus m extended by a
rule. @ : o→o→o and Λ : (o→o)→o are the function symbols and M : o→o
and N : o are the free variables of the first (β-)rule. We have made @ an
implicit binary infix operation and have written λx.s for Λ(x.s), for the λ-
calculus to take a more familiar form. If Ω abbreviates (λx.xx)(λx.xx), the
step f xy.(λu.x(u))y → f xy.x(y) is non-erasing but critical.

(non-fully-extended) |E| Ex. 5.9] Consider the non-FE P2RS in Tab. 1. The
step g xy.e(z,x)(z := f(y)) → g xy.e(f(y),x) is non-erasing but critical.

(non-left-linear) Consider the non-left-linear P2RS in Tab. 1. The rewrite step
g(y.e(x)(x := f(y)), y.c(x := f(y))) → g(y.e(f(y)), y.c(x := f(y))) from s to t is
non-erasing but critical; t is terminating, but we have the infinite reduction

    s → g(y.c(x := f(y)), y.c(x := f(y))) → c(x := f(a)) → ...

In each item, the second rule causes failure of uniform normalisation.

Hence, for uniform normalisation to hold some restrictions need to be imposed:
We assume PRSs to be left-linear and fully-extended P2RSs. For TRSs the fully-
extendedness condition is vacuous, hence the assumption reduces to left-linearity
as in the preceding section. The restriction to P2RSs entails no restriction w.r.t.
the other formats, since both CRSs and ERSs can be embedded into P2RSs, by coding
metavariables in rules as free variables of type o→...→o→o 1231 . To adapt the
proof of F1TP to P2RSs, we review its two main ingredients. The first one was a
notion of simultaneous reduction, extending one-step reduction such that:

— The residual of a non-erasing step after a context-step is non-erasing. 

The second ingredient was STD. It guarantees the following property: 

— Any redex pattern I which is entirely above a contracted redex is external 
to the reduction S; in particular, I cannot be replicated along S, it can only 
be eliminated by contraction of an overlapping redex in S. 

Since the residual of a parallel reduction after a step above it is usually not
parallel, we switch from -||-> to -o->, where the latter is the (one-rule restriction
of the) simultaneous reduction relation of ^2 Def. 3.4]. The context-part of such
a -o->-step is the part above or parallel to all occurrences of l.

Definition 11. Let ϱ : l → r be a rewrite rule. Write s -o->_ϱ t if it holds that
s = C[l^{σ_1}, ..., l^{σ_k}] and t = C[r^{τ_1}, ..., r^{τ_k}], where σ_i -o->_ϱ τ_i for all 1 ≤ i ≤ k.

Lemma 3 (Finiteness of Developments). (FD fWl Thm. 3.1.45]) Let s -o-> t
by simultaneously contracting redexes at positions in P. Repeated contraction
of residuals of redexes in P starting from s terminates and ends in t.

The second lemma on -o-> is a close relative of ^31 Lem. 5.1] and establishes the
first ingredient above. It fails for P3RSs as witnessed by the first item of Ex. 2.

Lemma 4 (Parallel Moves). Let ϱ : l → r and ϑ : g → d be PRS rules, with
ϑ second-order. If s' ←_ϑ s -o->_ϱ t is a fork such that ϑ is in the context-part of
the non-erasing simultaneous step, then the fork is joinable into s' -o->_ϱ t' ←_ϑ t,
with the simultaneous step non-erasing.

Proof. Joinability follows by FD. It remains to show non-erasingness. ϑ being
of order 2, each free variable Z occurs in g as Z(x_1, ..., x_n) with x_i : o and
Z : o→...→o→o and in d as Z(t_1, ..., t_n) with t_i : o. Hence, the residuals in
s' of redexes of s -o-> t are first-order substitution instances of them. Then, to
show preservation of non-erasingness it suffices to show that Var(s) ⊆ Var(s^σ)
for any first-order substitution σ, which follows by induction on s. □

Left-linearity and fully-extendedness are sufficient conditions for STD to hold. 

Theorem 7 (STD). Any reduction in a P2RS can be transformed into a stan-
dard one. The transformation preserves infiniteness.

Proof. The proof of the second part of the theorem is as for TRSs. For a proof of
the first part for left-linear fully-extended (orthogonal) CRSs see ITHl Sect. 7.7.3]
ll2till. By the correspondence between CRSs and P2RSs this suffices for our
purposes. (STD even holds for PRSs Cor. 1.5].) □

Proof. (of F2TP) Replace in the proof of F1TP everywhere -||-> by -o->. That
the (context)-case eventually applies follows by an appeal to FD. □

The proofs of the results below are obtained by analogous modifications. 

Theorem 8. Non-erasing rewrite steps are perpetual in biclosed P2RSs.

F2TP can be strengthened in various ways. Unlike for TRSs, a critical step in a
P2RS need not erase a term in ∞ as witnessed by e(f(x))(x := a) → c(x := a) in the
PRS {M(x)(x := N) → M(N), e(Z) → c, f(a) → f(a)}. Note that f(x) ∈ SN,
but by contracting the _(_ := _)-redex a is substituted for x and f(a) ∈ ∞.

Definition 12. An occurrence of (the head symbol of) a subterm is potentially
infinite if some descendant m of it along some reduction is in ∞. A step is
∞-erasing if it erases all potentially infinite subterms in its arguments.

For TRSs this notion of ∞-erasingness coincides with the one of Def. 9.

Corollary 8. Non-∞-erasing rewrite steps are perpetual in biclosed P2RSs.

Many variations of this result are possible. We mention two. First, the motivation 
for this paper originates with C3 Sect. 6.4], where we failed to obtain: 

Theorem 9. X-SK-calculus is uniformly normalising. 

Proof. By Cor. 0 since A-diy-calculus is weakly orthogonal. □ 

Second, we show that non-fully-extended P2RSs may have uniform normalisation.
By the same method, P2RSs where non-fully-extended steps are terminating and
postponable have uniform normalisation.

Theorem 10. Non-∞-erasing steps are perpetual in λβη-calculus 1^41 Prop. 27].

Proof. It suffices to remark that η-steps can be postponed after β-steps in a
standard reduction j2J Cor. 15.1.6]. Since η is terminating, an infinite standard
reduction must contain infinitely many β-steps, hence may be assumed to consist
of β's only and the proof of F2TP goes through unchanged. □

5 λx⁻

In this section familiarity with the nameful λ-calculus with explicit substitutions
λx⁻ of 0 is assumed. We define it as a P2RS and establish the fundamental
theorem of perpetuality for λx⁻:

Theorem 11 (F^TP). Non-erasing steps are perpetual in λx⁻.

Definition 13. The alphabet of λx⁻ consists of the function symbols @ :
o→o→o, Λ : (o→o)→o and _(_ := _) : (o→o)→o→o. As above, we make @ an
implicit infix operator associating to the left. The rules of λx⁻ are (for x ≠ y):



    (λx.M(x))N            →Beta  M(x)(x := N)
    x(x := N)             →      N
    y(x := N)             →      y
    (λy.M(y,x))(x := N)   →      λy.M(y,x)(x := N)
    (M(x)L(x))(x := N)    →      M(x)(x := N)L(x)(x := N)

The last four rules are the explicit substitution rules denoted x, generating →_x.
→_x is a terminating and orthogonal P2RS, hence the normal form of a term s
exists uniquely and is denoted by s↓x. Note that s↓x is a pure λ-term, i.e. it does
not contain closures (_(_ := _)-symbols). λx⁻ implements (only) substitution.

Lemma 5. 1. If s =_x t, then s↓x = t↓x.

2. If s →Beta t, then s↓x -o->_β t↓x.

3. If s is pure and s →_β t, then s →Beta · ↠_x t.

Remark that in the second item the number of β-steps might be zero, but is
always positive when the Beta-step is not inside a closure. We call λx⁻-reductions
without steps inside closures pretty. λx⁻ preserves strong normalisation in the
sense that any pure term which is β-terminating is λx⁻-terminating.

Lemma 6 (PSN). ^ Thm. 4 . 19] If s is pure and β-SN, then s is λx⁻-SN.

Proof. Suppose s ∈ ∞. Since λx⁻ is a fully-extended left-linear sub-P2RS, we
may by STD assume an infinite standard reduction S : s_0 → s_1 → ... from
s = s_0. We show that we may choose S to be pretty decent, where a reduction
is decent ^ Def. 4.16] if for every closure (x := t) in any term, t ∈ SN.

(init) s is decent since it is pure.

(step) Suppose s_i ∈ ∞ and s_i is decent. From the shape of the rules we
have that 'brackets are king': if any step takes place in t inside some
closure (x := t) in a standard reduction, then no step above the closure can
be performed later in the reduction. This entails that if t is terminating, S
need not perform any step inside t. Hence assume s_i → s_{i+1} is pretty.

^ It only is a since the y in the — ^-^^-rule ranges over variables not over terms. 

® Thinking of terms as trees representing hierarchies of people, creating a redex above 
(overruling) someone (the ruler) from below (the people) is a revolution. For clo- 
sures/brackets this is not possible, whence these are king. 

(Beta) Suppose s_i →Beta s_{i+1} contracting (λx.M(x))N to M(x)(x := N).
We may assume that N is terminating since otherwise we could instead
perform an infinite reduction on N itself, hence the reduct is decent.

(x) Otherwise, decency is preserved, since x-steps do not create closures.

Since x is terminating S must contain infinitely many Beta-steps. Since S is
pretty S↓x is an infinite β-reduction from s by (the remark after) Lem. 5. □

Our method relates to closure-tracking 0 as preventing to curing. Trying to 
apply it to prove ^ Conj. 6.45], stating that explicification of redex preserving 
CRSs is PSN, led to the following counterexample. 

Example 3. Consider the term s = (λx.b)a in the P2RS R with rewrite rules
{(λx.M(x))N → M(g(N,N)), a → b, g(a,b) → g(a,b)}. On the one hand s is
terminating, since s → b[x := g(a,a)] = b. On the other hand, explicifying R will
make s infinite, since g(a,a) → g(a,b) → g(a,b). The PRS is redex preserving in
the sense of ^ Def. 6.44] since any redex in the argument g(N,N) to M occurs
in N already. So s is a term for which PSN does not hold.

We expect the conjecture to hold for orthogonal CRSs. For our purpose, uni-
form normalisation, we will need the following corollary to Lem. 6 on preser-
vation of infinity. It is useful in situations where terms are only the same up
to the Substitution Lemma 0 Lem. 2.1.16]: M(x,y)(x := N(y))(y := L)↓x =
M(x,y)(y := L)(x := N(y)(y := L))↓x.

Corollary 9. If s is decent and s↓x = t↓x, then s ∈ ∞ implies t ∈ ∞.

How should non-erasingness be defined for Ax“? The naive attempt falters. 

Example 4. From the term s = ((λx.z)(yω))(y := ω), where ω = λx.xx, we have
a unique terminating reduction starting with a 'non-erasing' Beta-step:

    s →Beta z(x := yω)(y := ω) →_x z(y := ω) →_x z

On the other hand, developing (y := ω) yields the term ωω ∈ ∞.
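
The x-normal forms used in this example can be computed mechanically. The following is a minimal sketch, not the paper's construction: the tuple encoding of terms and the names `x_nf` and `push` are illustrative assumptions, and the code relies on the variable convention (bound names are distinct from the substituted variable and do not occur free in the substituted term), so no renaming is performed.

```python
# Illustrative encoding (an assumption of this sketch):
#   ("var", name) | ("lam", name, body) | ("app", fun, arg) | ("clo", body, name, arg)
# where ("clo", body, x, arg) stands for the closure body(x := arg).

def push(body, x, arg):
    """Push (x := arg) through a pure term, following the four x-rules."""
    tag = body[0]
    if tag == "var":
        # x(x := N) -> N   and   y(x := N) -> y
        return arg if body[1] == x else body
    if tag == "lam":
        # (λy.M)(x := N) -> λy.M(x := N); assumes y != x and y not free in N
        return ("lam", body[1], push(body[2], x, arg))
    # (M L)(x := N) -> M(x := N) L(x := N)
    return ("app", push(body[1], x, arg), push(body[2], x, arg))

def x_nf(t):
    """Return the x-normal form of t: a pure term without closures."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        return ("lam", t[1], x_nf(t[2]))
    if tag == "app":
        return ("app", x_nf(t[1]), x_nf(t[2]))
    _, body, x, arg = t
    return push(x_nf(body), x, x_nf(arg))

omega = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
z, y = ("var", "z"), ("var", "y")
# s = ((λx.z)(y ω))(y := ω)
s = ("clo", ("app", ("lam", "x", z), ("app", y, omega)), "y", omega)
# Beta-reduct of s: z(x := y ω)(y := ω)
reduct = ("clo", ("clo", z, "x", ("app", y, omega)), "y", omega)
```

Here `x_nf(s)` is the pure term (λx.z)(ωω), which still contains ωω, while `x_nf(reduct)` is just z: under the interpretation into pure terms the Beta-step has erased the argument.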

Translating the example into λβ-calculus shows that the culprit is the 'non-
erasing' Beta-step, which translates into an erasing β-step. Therefore:

Definition 14. A λx⁻-step contracting redex s to t is erasing if s → t is

    (λx.M(x))N →Beta M(x)(x := N), with x ∉ Var(M(x)↓x), or
    y(x := N) → y

Proof. (of Thm. 11) Since λx⁻ is a sub-P2RS, it suffices by the proof of F2TP
to consider perpetuality of a step s →_{p,ϱ} t, for some infinite standard reduction
S : s_0 →_{q_0,ϱ_0} s_1 ... starting from s = s_0 such that s_1 ← s → t is an
overlapping fork (case (‡) above). λx⁻ has only one non-trivial critical pair.
It arises by @ and Beta from s' = ((λx.M(x,y))N(y))(y := P), so let s = C[s'].

(Beta,@) In case s →Beta C[M(x,y)(x := N(y))(y := P)] = s_1, we note that

    s →@    C[(λx.M(x,y))(y := P)N(y)(y := P)] = t
      →λ    C[(λx.M(x,y)(y := P))N(y)(y := P)]
      →Beta C[M(x,y)(y := P)(x := N(y)(y := P))] = t_1

Consider a minimal closure in s_1 (or s_1 itself) which is decent and ∞, say
at position o. If o is parallel or properly below p, i.e. inside one of M(x,y),
N(y) or P, then obviously t_1 ∈ ∞. Otherwise, o is above p and t_1 ∈ ∞
follows from Corollary 9 since s_1|_o↓x = t_1|_o↓x.

(@,Beta) The case s →_{p,Beta} C[M(x,y)(x := N(y))(y := P)] = t is more
involved. Construct a maximal reduction T as follows. Let t_0 = t be the first
term of T and set o_0 = p.

— Suppose Si Si+i does not contract a redex below Oi. As an invari- 

ant we will use that Oi traces the position of @ (initially at p) along S. 
If Qi is parallel to Oi, then we set ti ti+i- Otherwise qt < Oi and 

by standardness this is only possible in case of an @-step distributing 
closures over the @ at o^. Then we set = ti and o^+i = qi. 

If this process continues, then T is infinite since in case no steps are generated 
Oi+i < Oi, hence eventually a step must be generated. If the process stops, 
say at n, then by construction = D[u]o„ and = D[v]o^, with u = 
{Xx.M{x,y)){y := P)N{y){y := P), v = M{x,y){x := N{y)){y := P) and 
{y := P) abbreviates a sequence of closures the first of which is {y := P). 
Per construction, o„ < q^ for the step ~^q^ and we are in the ‘non- 
replicating’ case: by standardness the @ cannot be replicated along S and it 
can only be eliminated as part of a Beta-step. Consider a maximal part of S 
not contracting o„. Remark that if any of M{x,y), N{y) and P is infinite, 
then tn G oo, so we assume them terminating. 

(context) If infinitely many steps parallel to Oi take place, then D G oo, 
hence tn = D[v] G oo. 

(left) Suppose infinitely many steps are in {Xx.M{x, y)){y:=P). This implies 
M{x, y){y := P) G oo, hence M{x, y){y := P){x := N{y){y := P)) G oo, 
which by Corollary 0 implies G oo. 

(right) Suppose infinitely many steps are in N(y){y := P). By non-erasing- 
ness of s — >-Beta t, X € M{x, y)iy- hence 

u ^x M {x, y)i,,{x := N{y)){y := P) 

= E[x,...,x]{x:=N{y)){y:^P) 

^x E*[x{x := N{y)){y := P), ...,x{x:= N{y)){y := P)] 
E*[N{y){y:=P),...,N{y){y:=P)] G oo 

where E* arises by pushing {x := N {y)) {y := P) through E, and E[, . . . ,] 
is a pure A-calculus context with at least one hole. Hence t = D[v] G oo. 
(Beta) Suppose On is Beta-reduced sometime in S. By standardness steps 
before Beta can be neither in occurrences of the closures {y := P) nor in 
M{x,y), hence we may assume S proceeds as: 

    s_n ↠     D[(λx.M(x,y)(y := P))N(y)(y := P)]
        →Beta D[M(x,y)(y := P)(x := N(y)(y := P))] = u'

We proceed as in item (Beta,@), using u'↓x = v↓x to conclude v ∈ ∞
by Corollary 9. The only exception to this is an infinite reduction from
N(y)(y := P), but such a reduction can be simulated from v by non-
erasingness of the Beta-step as in item (right). □

The proof is structured as before, only di/polluted by explicit substitutions trav- 
elling through the pivotal Beta-redex. Again, one can vary on these results. For 
example, it should not be difficult to show that non-∞-erasing steps are perpet-
ual, where y(x := N) → y is ∞-erasing if N ∈ ∞ and (λx.M(x))N →Beta M(x)(x := N) is
∞-erasing if x ∉ Var(x(M(x))) and N contains a potentially infinite subterm.

6 Conclusion 

The uniform normalisation proofs in the literature are mostly based on particu-
lar perpetual strategies, that is, strategies performing only perpetual steps.
Observing that the non-computable such strategies usually yield standard
reductions we have based our proof on standardisation, instead of searching
for yet another 'improved' perpetual strategy. This effort was successful and
resulted in a flexible proof strategy with a simple invariant easily adaptable to
a λ-calculus with explicit substitutions. Nevertheless, our results are still very
much orthogonality-bound: the biclosedness results arise by tweaking orthogo-
nality and the λx⁻ results by interpretation in the, orthogonal, λβ-calculus. It
would be interesting to see what can be done for truly non-orthogonal systems.
The fully-extendedness and left-linearity restrictions are serious ones, e.g. in the
area of process-calculi (scope extrusion) or even already for λx, so should be
ameliorated.

Abstract. We consider two decision problems related to the Knuth-
Bendix order (KBO). The first problem is orientability: given a system
of rewrite rules R, does there exist some KBO which orients every ground
instance of every rewrite rule in R? The second problem is whether a given
KBO orients a rewrite rule. This problem can also be reformulated as the
problem of solving a single ordering constraint for the KBO. We prove
that both problems can be solved in polynomial time. The algorithm
builds upon an algorithm for solving systems of homogeneous linear in-
equalities over integers. We also show that if a system is orientable using
a real-valued KBO, then it is also orientable using an integer-valued
KBO.

1 Introduction 

In this section we give an informal overview of the results proved in this paper. 
The formal definitions will be given in the next section. 

Let ≻ be any order on ground terms and l → r be a rewrite rule. We say that
≻ orients l → r, if for every ground instance l' → r' of l → r we have l' ≻ r'.
We write l ⪰ r if either l ≻ r or l = r. There are situations where we want to
check if there exists a simplification order on ground terms that orients a given
system of (possibly nonground) rewrite rules. We call this problem orientability.
Orientability can be useful when a theorem prover is run on a new problem for
which no suitable simplification order is known, or when termination of a rewrite
system is to be established automatically. For a recent survey, see 0. We consider
the orientability problem for the Knuth-Bendix orders (in the sequel KBO) |Z]
on ground terms. We give a polynomial-time algorithm for checking orientability
by the KBO. A similar problem of orientability by the nonground version of the
real-valued KBO was studied in ^ and an algorithm for orientability was given.
We prove that any rewrite rule system orientable by a real-valued KBO is also
orientable by an integer-valued KBO. This result also holds for the nonground
version of the KBO considered in p|. In our proofs we use some techniques of
0 . We also show that some rewrite systems cannot be oriented by the nonground
version of the KBO, but can be oriented by our algorithm.

The second problem we consider is solving ordering constraints consisting of
a single inequality, over the Knuth-Bendix order. If ≻ is total on ground terms,
then the problem of checking if ≻ orients l → r is related to the problem
of solving ordering constraints over ≻. Indeed, ≻ does not orient l → r if and
only if there exists a ground instance l' → r' of l → r such that r' ⪰ l', i.e., if
and only if the ordering constraint r ⪰ l has a solution. This means that any
procedure for solving ordering constraints consisting of a single inequality can
be used for checking whether a given system of rewrite rules is oriented by ≻,
and vice versa. Using the same technique as for the orientability problem, we
show that the problem of solving an ordering constraint consisting of a single
inequality for the KBO can be solved in polynomial time.

Algorithms for, and complexity of, orientability for various versions of the
recursive path ordering were considered in lEEini. The problems of solving
ordering constraints for lexicographic and recursive path orderings and for KBO
are NP-complete. However, to check if ≻ orients l → r, it is sufficient
to check solvability of a single ordering constraint r ⪰ l. This problem is NP-
complete for LPO 0, and therefore the problem of checking if an LPO orients
a rewrite rule is coNP-complete.

2 Preliminaries 

A signature is a finite set of function symbols with associated arities. In this
paper Σ denotes an arbitrary signature. Constants are function symbols of
arity 0. We assume that Σ contains at least one constant. We denote variables by
x, y, z, constants by a, b, c, d, e, function symbols by f, g, h, and terms by l, r, s, t.
Systems of rewrite rules and rewrite rules are defined as usual, see e.g. 0. An
expression E (e.g. a term or a rewrite rule) is called ground if no variable occurs
in E. Denote the set of natural numbers by N.

We call a weight function on Σ any function w : Σ → N such that (i)
w(a) > 0 for every constant a ∈ Σ, (ii) there exists at most one unary function
symbol f ∈ Σ such that w(f) = 0. Given a weight function w, we call w(g) the
weight of g. The weight of any ground term t, denoted |t|, is defined as follows:
for every constant c we have |c| = w(c) and for every function symbol g of a
positive arity |g(t_1, ..., t_n)| = w(g) + |t_1| + ... + |t_n|.

A precedence relation on Σ is any total order ≫ on Σ. A precedence relation
≫ is said to be compatible with a weight function w if for every unary function
symbol f, if w(f) = 0, then f is the greatest element w.r.t. ≫.

Let w be a weight function on Σ and ≫ a precedence relation on Σ compat-
ible with w. The Knuth-Bendix order induced by (w, ≫) is the binary relation
≻ on the set of ground terms of Σ defined as follows. For all ground terms
t = g(t_1, ..., t_n) and s = h(s_1, ..., s_k) we have t ≻ s if one of the following
conditions holds:

1. |t| > |s|;

2. |t| = |s| and g ≫ h;

3. |t| = |s|, g = h and for some 1 ≤ i ≤ n we have t_1 = s_1, ..., t_{i-1} = s_{i-1} and
t_i ≻ s_i.
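
The three conditions translate directly into a recursive comparison. The sketch below is only an illustration under assumptions that are not part of the paper: ground terms are encoded as nested tuples `(symbol, arg_1, ..., arg_n)`, the weight function is a dictionary `w`, and the precedence is a dictionary `prec` of integer ranks (a larger rank means greater w.r.t. the precedence). It does not check that `w` and `prec` are admissible or compatible.

```python
def weight(t, w):
    """Weight |t| of a ground term: w(head) plus the weights of the arguments."""
    return w[t[0]] + sum(weight(arg, w) for arg in t[1:])

def kbo_greater(t, s, w, prec):
    """Return True iff t is greater than s in the KBO induced by (w, prec)."""
    wt, ws = weight(t, w), weight(s, w)
    if wt != ws:                      # condition 1: compare weights
        return wt > ws
    if t[0] != s[0]:                  # condition 2: equal weights, compare heads
        return prec[t[0]] > prec[s[0]]
    for a, b in zip(t[1:], s[1:]):    # condition 3: first differing argument decides
        if a != b:
            return kbo_greater(a, b, w, prec)
    return False                      # t and s are identical

# Hypothetical signature: constants a, b; unary f of weight 0; binary g.
w = {"a": 1, "b": 1, "f": 0, "g": 1}
prec = {"b": 0, "a": 1, "g": 2, "f": 3}   # f is greatest, as compatibility requires
a, b = ("a",), ("b",)
```

With these choices `kbo_greater(("f", a), a, w, prec)` holds by condition 2, and `kbo_greater(("g", a, b), ("g", b, a), w, prec)` holds by condition 3.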



Verifying Orientability of Rewrite Rules Using the Knuth-Bendix Order 



The compatibility condition ensures that the Knuth-Bendix order is a simplifi-
cation order total on ground terms.

In the sequel we will often refer to the least and the greatest terms among
the terms of the minimal weight for a given KBO. It is easy to see that every
term of the minimal weight is either a constant of the minimal weight, or a term
f^n(c), where c is a constant of the minimal weight, and w(f) = 0. Therefore, the
least term of the minimal weight is always the constant of the minimal weight
which is the least among all such constants w.r.t. ≫. This constant is also the
least term w.r.t. ≻.

The greatest term of the minimal weight exists if and only if there is no unary
function symbol of the weight 0. In this case, this term is the constant of the
minimal weight which is the greatest among such constants w.r.t. ≫.

Definition 1 (substitution) A substitution is a mapping from a set of variables
to the set of terms. A substitution θ is grounding for an expression E (i.e., term,
rewrite rule etc.) if for every variable x occurring in E the term θ(x) is ground.
We denote by Eθ the expression obtained from E by replacing in it every variable
x by θ(x). A ground instance of an expression E is any expression Eθ which is
ground.
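
As a small illustration of this definition, the following sketch applies a substitution to a term and tests groundness. The encoding is an assumption made for the example only: variables are plain strings, and function applications are tuples `(symbol, arg_1, ..., arg_n)`; the names `apply_subst` and `is_ground` are hypothetical.

```python
def apply_subst(t, theta):
    """Return t with every variable x in theta's domain replaced by theta[x]."""
    if isinstance(t, str):            # a variable
        return theta.get(t, t)
    return (t[0],) + tuple(apply_subst(arg, theta) for arg in t[1:])

def is_ground(t):
    """A term is ground if no variable occurs in it."""
    if isinstance(t, str):
        return False
    return all(is_ground(arg) for arg in t[1:])

a, b = ("a",), ("b",)
lhs = ("g", "x", a, b)                # the term g(x, a, b)
theta = {"x": ("g", a, a, b)}         # grounding for lhs
instance = apply_subst(lhs, theta)    # g(g(a, a, b), a, b)
```

Since `theta` maps the only variable of `lhs` to a ground term, `instance` is a ground instance of `lhs`.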

The following definition is central to this paper. 

Definition 2 (orientability) A KBO ≻ orients a rewrite rule l → r if for every
ground instance l' → r' of l → r we have l' ≻ r'. A KBO orients a system R of
rewrite rules if it orients every rewrite rule in R.

Note that we define orientability in terms of ground instances of rewrite 
rules. One can also define orientability using the nonground version of the KBO 
as originally defined in [Z]. But then we obtain a weaker notion (fewer systems 
can be oriented) as the following example from shows. 

Example 1. Consider the rewrite rule g(x, a, b) → g(b, b, a). For any choice of
the weight function w and order ≫, g(x, a, b) ≻ g(b, b, a) does not hold for the
original Knuth-Bendix order with variables. However, this rewrite rule can be
oriented by any KBO such that w(a) ≥ w(b) and a ≫ b.
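
The claim can be spot-checked on a bounded set of ground instances. The sketch below is an illustration only, under stated assumptions: the signature is taken to be exactly {g, a, b} with g ternary, terms are nested tuples, the concrete weights and precedence ranks are arbitrary choices, and ground terms are enumerated only up to a fixed nesting depth, so the check is evidence rather than a proof.

```python
from itertools import product

def weight(t, w):
    return w[t[0]] + sum(weight(arg, w) for arg in t[1:])

def kbo_greater(t, s, w, prec):
    """Ground KBO comparison: weight first, then head symbol, then arguments."""
    wt, ws = weight(t, w), weight(s, w)
    if wt != ws:
        return wt > ws
    if t[0] != s[0]:
        return prec[t[0]] > prec[s[0]]
    for x, y in zip(t[1:], s[1:]):
        if x != y:
            return kbo_greater(x, y, w, prec)
    return False

A, B = ("a",), ("b",)

def ground_terms(depth):
    """All ground terms over {g/3, a, b} of nesting depth at most `depth`."""
    terms = [A, B]
    for _ in range(depth):
        terms = [A, B] + [("g",) + args for args in product(terms, repeat=3)]
    return terms

def orients_example(w, prec, depth=2):
    """Check g(t, a, b) > g(b, b, a) for every enumerated ground term t."""
    rhs = ("g", B, B, A)
    return all(kbo_greater(("g", t, A, B), rhs, w, prec)
               for t in ground_terms(depth))

w = {"g": 1, "a": 1, "b": 1}                      # w(a) >= w(b)
a_above_b = {"b": 0, "a": 1, "g": 2}              # a is above b
b_above_a = {"a": 0, "b": 1, "g": 2}              # b is above a
```

With a above b in the precedence every enumerated instance is oriented; swapping a and b already fails for the instance with x replaced by a.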

In fact the order based on all ground instances is the greatest simplification order 
extending the ground KBO to nonground terms. 

3 Systems of Homogeneous Linear Inequalities 

In our proofs and in the algorithm we will use several properties of homogeneous
linear inequalities. The definitions related to systems of linear inequalities can
be found in standard textbooks (e.g., uni). We will denote column vectors of
variables by X, integer or real vectors by V, W, integer or real matrices by A, B.
Column vectors consisting of 0's will be denoted by 0. The set of real numbers
is denoted by R, and the set of nonnegative real numbers by R+.

Definition 3 (homogeneous linear inequalities) A homogeneous linear in-
equality has the form either VX ≥ 0 or VX > 0. A system of homogeneous
linear inequalities is a finite set of homogeneous linear inequalities.

Solutions (real or integer) to systems of homogeneous linear inequalities are
defined as usual.

We will use the following fundamental property of systems of homogeneous
linear inequalities:

Lemma 1. Let AX ≥ 0 be a system of homogeneous linear inequalities, where
A is an integer matrix. Then there exists a finite number of integer vectors
V_1, ..., V_n such that the set of solutions to AX ≥ 0 is

    {r_1 V_1 + ... + r_n V_n | r_1, ..., r_n ∈ R+}.    (1)

The proof can be found in e.g., m-

The following lemma was proved in for the systems of linear homogeneous
inequalities over the real numbers. We will give a simpler proof of it here.

Lemma 2. Let AX ≥ 0 be a system of homogeneous linear inequalities where
A is an integer matrix and Sol be the set of all real solutions to the system. Then
the system can be split into two disjoint subsystems BX ≥ 0 and CX ≥ 0 such
that

1. BV = 0 for every V ∈ Sol.

2. If C is nonempty then there exists a solution V ∈ Sol such that CV > 0.

Proof. By Lemma 1 we can find integer vectors V_1, ..., V_n such that the set
Sol is (1). We define BX ≥ 0 to be the system consisting of all inequalities
WX ≥ 0 in the system such that WV_i = 0 for all i = 1, ..., n; then property 1
is obvious.

Note that the system CX ≥ 0 consists of the inequalities WX ≥ 0 such that
for some V_i we have WV_i > 0. Take V to be V_1 + ... + V_n. Since every V_i is a
solution, WV_j ≥ 0 for all j, and WV_i > 0 for at least one i, so WV > 0 for
every such W; hence CV > 0. □

Let D be a system of homogeneous linear inequalities with
a real matrix. We will call the subsystem BX ≥ 0 of D the degenerate subsystem
if the following holds. Denote by C the matrix of the complement to BX ≥ 0
in D and by Sol the set of all real solutions to D. Then

1. BV = 0 for every V ∈ Sol.

2. If C is nonempty then there exists a solution V ∈ Sol such that CV > 0.

For every system D of homogeneous linear inequalities the degenerate subsystem
of D will be denoted by D^=. Note that the degenerate subsystem is defined for
arbitrary systems, not only those of the form AX ≥ 0.
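
The construction in the proof of Lemma 2 can be sketched as follows. This is an illustration under an explicit assumption: the generating vectors V_1, ..., V_n of the solution cone (Lemma 1) are supplied by the caller; computing them is not shown, and the function name `split_by_generators` is hypothetical.

```python
def dot(w, v):
    return sum(x * y for x, y in zip(w, v))

def split_by_generators(rows, gens):
    """Split the rows W of a system AX >= 0, given generators of its solution cone.

    Returns (degenerate_rows, other_rows, witness), where W.V_i = 0 for all
    generators on the degenerate rows, and witness = V_1 + ... + V_n satisfies
    W.witness > 0 on all other rows.
    """
    degenerate = [w for w in rows if all(dot(w, v) == 0 for v in gens)]
    others = [w for w in rows if any(dot(w, v) != 0 for v in gens)]
    witness = [sum(column) for column in zip(*gens)]
    return degenerate, others, witness

# System: x - y >= 0, y - x >= 0, x >= 0. Its solutions are the points
# x = y >= 0, a cone generated by the single vector (1, 1).
rows = [(1, -1), (-1, 1), (1, 0)]
gens = [(1, 1)]
degenerate, others, witness = split_by_generators(rows, gens)
```

In this example the first two inequalities form the degenerate subsystem (they hold with equality on every solution), and the witness (1, 1) satisfies the remaining inequality strictly.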

Let us now prove another key property of integer systems of homogeneous 
linear inequalities: the existence of a real solution implies the existence of an 
integer solution. 



Lemma 3. Let D be a system of homogeneous linear inequalities with an integer
matrix. Let V be a real solution to this system and for some subsystem of D with
the matrix B we have BV > 0. Then there exists an integer solution V' to D
for which we also have BV' > 0.

Proof. Let D' be obtained from D by replacement of all strict inequalities WX >
0 by their nonstrict versions WX ≥ 0. Take vectors V_1, ..., V_n so that the set of
solutions to D' is (1). Since V is a nonnegative combination of the V_i and
BV > 0, for every row W of B there exists some V_i such that WV_i > 0. Define
V' as V_1 + ... + V_n, then it is not hard to argue that BV' > 0. We claim that
V' is a solution to D. Assume the converse, then there exists an inequality
WX > 0 in D such that WV' = 0. But WV' = 0 implies that WV_i = 0 for all i,
hence WU = 0 for every solution U of D', in particular for V, so WX > 0
cannot belong to D. □

The following lemma follows from Lemmas 2 and 3.

Lemma 4. Let D be a system of homogeneous linear inequalities with an integer
matrix whose degenerate subsystem is different from D. Let B be the matrix
of the complement of the degenerate subsystem. Then there exists an integer
solution V to D such that BV > 0.

The following result is well-known, see e.g., M- 

Lemma 5. The existence of a real solution to a system of linear inequalities 
can be decided in polynomial time. 

This lemma and Lemma 0 imply the following key result. 

Lemma 6. (i) The existence of an integer solution to an integer system of ho-
mogeneous linear inequalities can be decided in polynomial time. (ii) If an integer
system D of homogeneous linear inequalities has a solution, then its degenerate
subsystem can be found in polynomial time.

4 States 

In Section 5 we will present an algorithm for ground orientability by the
Knuth-Bendix order. This algorithm will work on states which generalize systems of
rewrite rules in several ways. A state will use a generalization of rewrite rules to
tuples of terms and some information about possible solutions.

Let ≻ be any order on ground terms. We extend it lexicographically to an
order on tuples of ground terms as follows: we write ⟨l_1, ..., l_n⟩ ≻ ⟨r_1, ..., r_n⟩
if for some i ∈ {1, ..., n} we have l_1 = r_1, ..., l_{i-1} = r_{i-1} and l_i ≻ r_i. We call
a tuple inequality any expression ⟨l_1, ..., l_n⟩ > ⟨r_1, ..., r_n⟩. The length of this
tuple inequality is n.
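The lexicographic extension can be sketched as a small function that takes the base order as a parameter. The function name and the toy base order (strings compared by length) are assumptions made only for this illustration.

```python
def lex_greater(left, right, greater):
    """True if tuple `left` is lexicographically greater than `right`
    with respect to the strict order `greater` on components."""
    if len(left) != len(right):
        raise ValueError("tuples must have the same length")
    for l, r in zip(left, right):
        if l == r:
            continue            # equal prefix so far, look at the next position
        return greater(l, r)    # the first differing position decides
    return False                # all components equal: not strictly greater


def longer(s, t):
    """Toy base order for the demo: a string is greater if it is longer."""
    return len(s) > len(t)
```

Only the first position where the tuples differ matters: if the components there are not ordered as required, no later position can satisfy the definition, because the prefix before it is no longer equal.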

In the sequel we assume that Σ is a fixed signature and e is a constant not
belonging to Σ. The constant e will play the role of a temporary substitute for
a constant of the minimal weight. We will present the algorithm for orienting a
system of rewrite rules as a sequence of state changes. We call a state a tuple
(R, M, D, U, G, L, ≫), where






K. Korovin and A. Voronkov 



1. R is a set of tuple inequalities ⟨l_1, ..., l_n⟩ > ⟨r_1, ..., r_n⟩ such that every two
different tuple inequalities in this set have disjoint variables.

2. M is a set of variables. This set denotes the variables ranging over the terms
of the minimal weight.

3. D is a system of homogeneous linear inequalities with variables
{w_g | g ∈ Σ ∪ {e}}. This system denotes constraints on the weight function
collected so far, and w_e denotes the minimal weight of terms.

4. U is one of the following values: one or undefined. The value one signals that
there exists exactly one term of the minimal weight, while undefined means
that no constraints on the number of elements of the minimal weight have
been imposed.

5. G and L are sets of constants, each of which contains at most one element.
If d ∈ G (respectively d ∈ L), this signals that d is the greatest (respectively
least) term among the terms of the minimal weight.

6. ≫ is a binary relation on Σ. This relation denotes the subset of the
precedence relation computed so far.

Let w be a weight function on Σ, ≫' a precedence relation on Σ compatible
with w, and ≻ the Knuth-Bendix order induced by (w, ≫'). A substitution σ
grounding for a set of variables X is said to be minimal for X if for every variable
x ∈ X the term σ(x) is of the minimal weight. We extend w to e by defining
w(e) to be the minimal weight of a constant of Σ.

We say that the pair (w, ≫') is a solution to a state (R, M, D, U, G, L, ≫) if

1. The weight function w solves every inequality in D in the following sense:
replacement of each w_g by w(g) gives a tautology.

2. If U = one, then there exists exactly one term of the minimal weight.

3. If d ∈ G (respectively d ∈ L) for some constant d, then d is the greatest
(respectively least) term among the terms of the minimal weight. Note that
if d is the greatest term of the minimal weight, then the signature contains
no unary function symbol of the weight 0.

4. For every tuple inequality ⟨l_1, ..., l_n⟩ > ⟨r_1, ..., r_n⟩ in R and every
substitution σ grounding for this tuple inequality and minimal for M we have
⟨l_1σ, ..., l_nσ⟩ ≻ ⟨r_1σ, ..., r_nσ⟩.

5. ≫' extends ≫.

We will now show how to reduce the orientability problem for the systems of 
rewrite rules to the solvability problem for states. 

Let R be a system of rewrite rules such that every two different rules in R
have disjoint variables. Denote by S_R the state (R, M, D, U, G, L, ≫) defined as
follows (here the first component of the state is a set of tuple inequalities, not
the rule system itself).

1. The first component consists of all tuple inequalities ⟨l⟩ > ⟨r⟩ such that
l → r belongs to the rewrite rule system R.

2. M = ∅.

3. D consists of (a) all inequalities w_g ≥ 0, where g ∈ Σ is a nonconstant; (b)
the inequality w_e > 0 and all inequalities w_d - w_e ≥ 0, where d is a constant
of Σ.

4. U = undefined.

5. G = L = ∅.

6. ≫ is the empty binary relation on Σ.
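One possible in-memory representation of a state and of the initial state can be sketched as follows. The class and field names, the encoding of an inequality as a pair (coefficient dictionary, strictness flag), and the use of the string "e" for the extra constant are all assumptions of this sketch. It also assumes that no signature symbol is itself named "e".

```python
from dataclasses import dataclass, field


@dataclass
class State:
    R: list                                   # tuple inequalities as (lefts, rights)
    M: set = field(default_factory=set)       # variables ranging over minimal terms
    D: list = field(default_factory=list)     # inequalities as (coeffs, is_strict)
    U: str = "undefined"                      # "one" or "undefined"
    G: set = field(default_factory=set)       # at most one constant
    L: set = field(default_factory=set)       # at most one constant
    prec: set = field(default_factory=set)    # pairs (g, h) meaning g >> h


def initial_state(rules, arities):
    """rules: list of (l, r) term pairs; arities: dict symbol -> arity."""
    D = []
    for g, n in arities.items():
        if n > 0:
            D.append(({g: 1}, False))             # w_g >= 0 for nonconstants
        else:
            D.append(({g: 1, "e": -1}, False))    # w_d - w_e >= 0 for constants
    D.append(({"e": 1}, True))                    # w_e > 0
    return State(R=[((l,), (r,)) for l, r in rules], D=D)


# Terms as nested tuples (symbol, arg, ...); variables as plain strings.
rules = [(("g", "x", ("a",), ("b",)), ("g", ("b",), ("b",), ("a",)))]
state = initial_state(rules, {"g": 3, "a": 0, "b": 0})
```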



Lemma 7. Let w be a weight function, ≫' a precedence relation on Σ compatible
with w, and ≻ the Knuth-Bendix order induced by (w, ≫'). Then ≻ orients R if
and only if (w, ≫') is a solution to S_R.

The proof is straightforward. 

For technical reasons, we will distinguish two kinds of signatures. Essentially, 
our algorithm depends on whether the weights of terms are restricted or not. 
For the so-called nontrivial signatures, the weights are not restricted. When we 
present the orientability algorithm for the nontrivial signatures, we will use the 
fact that terms of sufficiently large weight always exist. For the trivial signatures 
the algorithm is presented in the full version of this paper [10].

A signature Σ is called trivial if it contains no function symbol of arity ≥ 2
and at most one unary function symbol. Note that Σ is nontrivial if and only if
it contains either a function symbol of arity ≥ 2 or at least two function symbols
of arity 1. The proof of the following lemma can be found in m- 

Lemma 8. Let Σ be a nontrivial signature and w be a weight function for Σ.
Then for every integer m there exists a ground term t of the signature Σ such that
|t| > m.
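The definition of a trivial signature translates directly into a check on arities. In this sketch a signature is assumed to be given as a dictionary from symbol names to arities. That representation and the function name are not from the paper.

```python
def is_trivial(arities):
    """True if there is no symbol of arity >= 2 and at most one unary symbol."""
    has_large_arity = any(n >= 2 for n in arities.values())
    unary_count = sum(1 for n in arities.values() if n == 1)
    return not has_large_arity and unary_count <= 1
```

For instance, a signature with one constant and one unary symbol is trivial, while adding a second unary symbol or any symbol of arity at least 2 makes it nontrivial.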



5 An Algorithm for Orientability in the Case of 
Nontrivial Signatures 



In this section we only consider nontrivial signatures. An algorithm for trivial
signatures is given in [10]. The algorithm given in this section will be illustrated
below in Section 5.5 on a simple rewrite rule.

Our algorithm works as follows. Given a system of rewrite rules R,
we build the initial state S_R = (R, M, D, U, G, L, ≫). Then we transform
(R, M, D, U, G, L, ≫) repeatedly as described below. We call the size of the state
the total number of occurrences of function symbols and variables in R. Every
transformation step will terminate with either success or failure, or else decrease
the size of R.

At each step we assume that R consists of k tuple inequalities

    ⟨l_1, L_1⟩ > ⟨r_1, R_1⟩,
    ...
    ⟨l_k, L_k⟩ > ⟨r_k, R_k⟩,                                        (2)

such that all of the L_i, R_i are tuples of terms. We denote by S^{-i} the state
obtained from S by removal of the ith tuple inequality ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ from R.

We will label parts of the algorithm; these labels will be used in the proof of
its soundness. The algorithm can make a nondeterministic choice, but at most
once, and the number of nondeterministic branches is bounded by the number
of constants in Σ.

When the set D of linear inequalities changes, we assume that we check the
new set for satisfiability, and terminate with failure if it is unsatisfiable. Likewise,
when we change ≫, we check if it can be extended to an order and terminate
with failure if it cannot.



5.1 The Algorithm 

The algorithm works as follows. Every step consists of a number of state trans- 
formations, beginning with PREPROCESS defined below. During the algorithm, 
we will perform two kinds of consistency checks: 

- The consistency check on D is the check if D has a solution. If it does not,
we terminate with failure.

- The consistency check on ≫ is the check if ≫ can be extended to an order,
i.e., the transitive closure ≫' of ≫ is irreflexive, i.e., for no g ∈ Σ we have
g ≫' g. If ≫ cannot be extended to an order, we terminate with failure.

It is not hard to argue that both kinds of consistency checks can be performed
in polynomial time. The consistency check on D is polynomial by Lemma 6. The
consistency check on ≫ is polynomial since the transitive closure of a binary
relation can be computed in polynomial time.
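The consistency check on the precedence can be sketched as a naive transitive-closure computation followed by an irreflexivity test. The representation of the relation as a set of pairs and the function name are assumptions of this sketch. The simple fixpoint loop is polynomial, although faster closure algorithms exist.

```python
def can_extend_to_order(pairs):
    """pairs: iterable of (g, h) meaning g >> h.
    True iff the transitive closure of the relation is irreflexive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))   # a >> b and b >> d give a >> d
                    changed = True
    return all(a != b for (a, b) in closure)
```

A relation containing both f1 >> f2 and f2 >> f1 fails the check, since its closure contains f1 >> f1.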

PREPROCESS. Do the following transformations while possible. If any tuple
inequality in R has length 0, remove it from R. Otherwise, if R contains a
tuple inequality ⟨l_1, ..., l_n⟩ > ⟨l_1, ..., l_n⟩, terminate with failure. Otherwise,
if R contains a tuple inequality ⟨l, l_1, ..., l_n⟩ > ⟨l, r_1, ..., r_n⟩, replace it by
⟨l_1, ..., l_n⟩ > ⟨r_1, ..., r_n⟩.

If R becomes empty, proceed to TERMINATE, otherwise continue with MAIN.
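The PREPROCESS transformations can be sketched as a single pass over the tuple inequalities. The sketch assumes that each tuple inequality is a pair (lefts, rights) of equally long tuples of terms. This representation, the function name, and the "fail"/"ok" return convention are illustrative choices, not the paper's.

```python
def preprocess(R):
    """Return ("fail", None), or ("ok", new_R) with common prefixes stripped."""
    result = []
    for lefts, rights in R:
        lefts, rights = tuple(lefts), tuple(rights)
        if len(lefts) == 0:
            continue                      # length 0: removed, as in the text
        if lefts == rights:
            return "fail", None           # identical tuples cannot be ordered
        # The tuples differ somewhere, so this loop stops before they are empty.
        while lefts[0] == rights[0]:
            lefts, rights = lefts[1:], rights[1:]
        result.append((lefts, rights))
    return "ok", result
```

A caller would proceed to TERMINATE when the returned list is empty and to MAIN otherwise.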

MAIN. Now we can assume that in (2) each l_i is a term different from the
corresponding term r_i. For every variable x and term t denote by n(x, t) the
number of occurrences of x in t. For example, n(x, g(x, h(y, x))) = 2. Likewise,
for every function symbol g ∈ Σ and term t denote by n(g, t) the number of
occurrences of g in t. For example, n(h, g(x, h(y, x))) = 1.

(M1) For all x and i such that n(x, l_i) > n(x, r_i), add x to M.

(M2) If for some i there exists a variable x ∉ M such that n(x, l_i) < n(x, r_i),
then terminate with failure.
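Steps (M1) and (M2) only need occurrence counts of variables, so they can be sketched compactly. The term encoding (variables as strings, applications as tuples `(symbol, arg, ...)`, constants as one-element tuples) and all function names are assumptions of this sketch.

```python
def occurrences(x, term):
    """n(x, t): number of occurrences of the variable x in the term t."""
    if isinstance(term, str):
        return 1 if term == x else 0
    return sum(occurrences(x, arg) for arg in term[1:])


def variables(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    result = set()
    for arg in term[1:]:
        result |= variables(arg)
    return result


def m1_m2(pairs, M):
    """pairs: list of head terms (l_i, r_i). Returns the enlarged M,
    or None if step (M2) terminates with failure."""
    M = set(M)
    for l, r in pairs:                                  # step (M1)
        for x in variables(l) | variables(r):
            if occurrences(x, l) > occurrences(x, r):
                M.add(x)
    for l, r in pairs:                                  # step (M2)
        for x in variables(l) | variables(r):
            if x not in M and occurrences(x, l) < occurrences(x, r):
                return None
    return M
```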

For every pair of terms l, r, denote by W(l, r) the linear inequality obtained
as follows. Let v_l and v_r be the numbers of occurrences of variables in l and r
respectively. Then

    W(l, r) = Σ_{g ∈ Σ} (n(g, l) - n(g, r)) · w_g + (v_l - v_r) · w_e ≥ 0.        (3)

For example, if l = h(x, f(y)) and r = f(g(x, g(x, y))), then
W(l, r) = w_h - 2 · w_g - w_e ≥ 0.
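The coefficients of W(l, r) can be computed by counting symbols on both sides and subtracting. The sketch below uses the same assumed term encoding as before (variables as strings, applications as tuples) and stores the coefficient of w_e under the key "e", which presumes that no signature symbol is named "e". The function names are illustrative.

```python
from collections import Counter


def count_symbols(term, counts):
    """Count function symbols; every variable occurrence is counted under "e"."""
    if isinstance(term, str):
        counts["e"] += 1
    else:
        counts[term[0]] += 1
        for arg in term[1:]:
            count_symbols(arg, counts)
    return counts


def W(l, r):
    """Nonzero coefficients of the inequality W(l, r) >= 0 as {symbol: coefficient}."""
    cl = count_symbols(l, Counter())
    cr = count_symbols(r, Counter())
    return {k: cl[k] - cr[k] for k in set(cl) | set(cr) if cl[k] != cr[k]}


# The example from the text: l = h(x, f(y)), r = f(g(x, g(x, y))).
l = ("h", "x", ("f", "y"))
r = ("f", ("g", "x", ("g", "x", "y")))
coeffs = W(l, r)
```

For the example above the symbol f cancels, leaving the coefficients 1 for w_h, -2 for w_g and -1 for w_e, in agreement with the inequality given in the text.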

(M3) Add to D all the linear inequalities W(l_i, r_i) for all i and perform the
consistency check on D.

Now compute D^=. If D^= contains none of the inequalities W(l_i, r_i), proceed to
TERMINATE. Otherwise, for all i such that W(l_i, r_i) ∈ D^= apply the applicable
case below, depending on the form of l_i and r_i.

(M4) If (l_i, r_i) has the form (g(s_1, ..., s_n), h(t_1, ..., t_p)), where g is different
from h, then extend ≫ by adding g ≫ h and remove the tuple inequality
⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ from R. Perform the consistency check on ≫.

(M5) If (l_i, r_i) has the form (g(s_1, ..., s_n), g(t_1, ..., t_n)), then replace
⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ by ⟨s_1, ..., s_n, L_i⟩ > ⟨t_1, ..., t_n, R_i⟩.

(M6) If (l_i, r_i) has the form (x, y), where x and y are different variables, do the
following. If L_i is empty, then terminate with failure. Otherwise, set U to one,
add x to M and replace ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ by ⟨L_i⟩ > ⟨R_i⟩.

(M7) If (l_i, r_i) has the form (x, t), where t is not a variable, do the following. If t
is not a constant, or L_i is empty, then terminate with failure. So assume that t is
a constant c. If L = {d} for some d different from c, then terminate with failure.
Otherwise, set L to {c}. Replace in L_i and R_i the variable x by c, obtaining L_i'
and R_i' respectively, and then replace ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ by ⟨L_i'⟩ > ⟨R_i'⟩.

(M8) If (l_i, r_i) has the form (t, x), where t is not a variable, do the following. If
t contains x, remove ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ from R. Otherwise, if t is a nonconstant
or L_i is empty, terminate with failure. Let now t be a constant c. If G = {d} for
some d different from c, then terminate with failure. Otherwise, set G to {c}.
Replace in L_i and R_i the variable x by c, obtaining L_i' and R_i' respectively, and
then replace ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ by ⟨L_i'⟩ > ⟨R_i'⟩.

After this step repeat PREPROCESS. 

TERMINATE. Let (R, M, D, U, G, L, ≫) be the current state. Do the following.

(T1) If d ∈ G, then for all constants c different from d such that w_c - w_e ≥ 0
belongs to D^=, extend ≫ by adding d ≫ c. Likewise, if c ∈ L, then for all
constants d different from c such that w_d - w_e ≥ 0 belongs to D^=, extend ≫ by
adding d ≫ c. Perform the consistency check on ≫.

(T2) For all f in Σ do the following. If f is a unary function symbol and w_f ≥ 0
belongs to D^=, then extend ≫ by adding f ≫ d for all d ∈ Σ - {f}. Perform
the consistency check on ≫. If, in this case, U = one or G ≠ ∅, then terminate
with failure.

(T3) If there exists no constant c such that w_c - w_e ≥ 0 is in D^=,
nondeterministically choose a constant c ∈ Σ, add w_e - w_c ≥ 0 to D, perform the
consistency check on D and repeat PREPROCESS.

(T4) If U = one, then terminate with failure if there exists more than one
constant c such that w_c - w_e ≥ 0 belongs to D^=.

(T5) Terminate with success. 

We will show how to build a solution at step (T5) below in Lemma 21.

5.2 Correctness

In this section we prove correctness of the algorithm. In Section 5.3 we show
how to find a solution when the algorithm terminates with success. The correctness
will follow from a series of lemmas asserting that the transformation steps
performed by the algorithm preserve the set of solutions. We will use the notation
and terminology of the algorithm. We say that a step of the algorithm is correct
if the set of solutions to the state before this step coincides with the set
of solutions after the step. When we prove correctness of a particular step, we
will always denote by S = (R, M, D, U, G, L, ≫) the state before this step, and
by S' = (R', M', D', U', G', L', ≫') the state after this step. When we use
substitutions in the proof, we always assume that the substitutions are grounding for
the relevant terms.

The following two lemmas can be proved by a straightforward application of 
the definition of solution to a state. 

Lemma 9 (consistency check). If the consistency check on D or on ≫ terminates
with failure, then S has no solution.

Lemma 10. Step PREPROCESS is correct. 

Let us now analyze MAIN. For every weight function w and precedence relation
≫ compatible with w we call a counterexample to ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ w.r.t.
(w, ≫) any substitution σ minimal for M such that ⟨r_iσ, R_iσ⟩ ⪰ ⟨l_iσ, L_iσ⟩ for
the order ≻ induced by (w, ≫).

The following lemma follows immediately from the definition of solution. 

Lemma 11 (counterexample). If for every solution (w, ≫) to S^{-i} there exists
a counterexample to ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ w.r.t. (w, ≫), then S has no solution. If
for every solution (w, ≫) to S^{-i} there exists no counterexample to the tuple
inequality ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩, then removing this tuple inequality from R does not
change the set of solutions to S.

This lemma means that we can change ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ into a different tuple
inequality or change M, if we can prove that this change does not influence the
existence of a counterexample.

Let σ be a substitution, x a variable and t a term. We will denote by σ_x^t the
substitution defined by

    σ_x^t(y) = σ(y) if y ≠ x, and σ_x^t(y) = t if y = x.



Lemma 12. Let for some x and i we have n(x, l_i) > n(x, r_i) and there exists
a counterexample σ to ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ w.r.t. (w, ≫). Then there exists a
counterexample σ' to ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ w.r.t. (w, ≫) minimal for {x}.

Proof. Suppose that σ is not minimal for {x}. Denote by c a minimal constant
w.r.t. w and by t the term xσ. Since σ is not minimal for x, we have |t| > |c|.
Consider the substitution σ_x^c. Since σ is a counterexample, we have |r_iσ| ≥ |l_iσ|.
We have

    |l_iσ_x^c| = |l_iσ| - n(x, l_i) · (|t| - |c|);
    |r_iσ_x^c| = |r_iσ| - n(x, r_i) · (|t| - |c|).

Then

    |r_iσ_x^c| = |r_iσ| - n(x, r_i) · (|t| - |c|) ≥ |l_iσ| - n(x, r_i) · (|t| - |c|)
               > |l_iσ| - n(x, l_i) · (|t| - |c|) = |l_iσ_x^c|.

Therefore, |r_iσ_x^c| > |l_iσ_x^c|, and so σ_x^c is a counterexample too.

One can immediately see that this lemma implies correctness of step (M1).

Lemma 13. Step (M1) is correct.

Proof. Evidently, every solution to S is also a solution to S'. But by Lemma 12
every counterexample to S can be turned into a counterexample to S', so every
solution to S' is also a solution to S.

Let us now turn to step (M2). 

Lemma 14 (M2). If for some i and x ∉ M we have n(x, l_i) < n(x, r_i), then S
has no solution. Therefore, step (M2) is correct.

Proof. We show that for every (w, ≫) there exists a counterexample to
⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ w.r.t. (w, ≫). Let σ be any substitution grounding for this
tuple inequality. Take any term t and consider the substitution σ_x^t. We have

    |r_iσ_x^t| - |l_iσ_x^t| = |r_iσ| - |l_iσ| + (n(x, r_i) - n(x, l_i)) · (|t| - |xσ|).

By Lemma 8 there exist terms of an arbitrarily large weight, so for a term t of
a large enough weight we have |r_iσ_x^t| > |l_iσ_x^t|, and so σ_x^t is a counterexample to
⟨l_i, L_i⟩ > ⟨r_i, R_i⟩.

Correctness of (M2) is straightforward.

Note that after step (M2) for all i and x ∉ M we have n(x, l_i) = n(x, r_i).
Denote by Θ_c the substitution such that Θ_c(x) = c for every variable x.

Lemma 15 (M3). Let for all i and x ∉ M we have n(x, l_i) = n(x, r_i). Every
solution to S is also a solution to W(l_i, r_i). Therefore, step (M3) is
correct.

Proof. Let c be a constant of the minimal weight. Consider the substitution
Θ_c. Note that this substitution is minimal for M. It follows from the definition
of W that (w, ≫) is a solution to W(l_i, r_i) if and only if |l_iΘ_c| ≥ |r_iΘ_c|. But
|l_iΘ_c| ≥ |r_iΘ_c| is a straightforward consequence of the definition of solutions to
tuple inequalities.

Correctness of (M3) is straightforward.






Lemma 16. Let for all x ∉ M we have n(x, l_i) = n(x, r_i). Let also W(l_i, r_i) ∈
D^=. Then for every solution to S^{-i} and every substitution σ minimal for M we
have |l_iσ| = |r_iσ|.

Proof. Using the fact that for all x ∉ M we have n(x, l_i) = n(x, r_i), it is not
hard to argue that |l_iσ| - |r_iσ| does not depend on σ, whenever σ is minimal for
M.

It follows from the definition of W that if W(l_i, r_i) ∈ D^=, then for every
solution to D (and so for every solution to S^{-i}) we have |l_iΘ_c| = |r_iΘ_c|. Therefore,
|l_iσ| = |r_iσ| for all substitutions σ minimal for M.

The proof of correctness of steps (M4)-(M8) will use this lemma in the
following way. A pair (w, ≫) is a solution to S if and only if it is a solution to
S^{-i} and a solution to ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩. Equivalently, (w, ≫) is a solution to
S if and only if it is a solution to S^{-i} and for every substitution σ minimal for
M we have ⟨l_iσ, L_iσ⟩ ≻ ⟨r_iσ, R_iσ⟩. But by Lemma 16 we have |l_iσ| = |r_iσ|, so
⟨l_iσ, L_iσ⟩ ≻ ⟨r_iσ, R_iσ⟩ must be satisfied by either condition 2 or condition 3 of
the definition of the Knuth-Bendix order.

This consideration can be summarized as follows. 

Lemma 17. Let for all x ∉ M we have n(x, l_i) = n(x, r_i). Let also W(l_i, r_i) ∈
D^=. Then (w, ≫) is a solution to S if and only if it is a solution to S^{-i} and for
every substitution σ minimal for M the following holds. Let l_iσ = g(t_1, ..., t_n)
and r_iσ = h(s_1, ..., s_p). Then at least one of the following conditions holds:

1. l_iσ = r_iσ and L_iσ ≻ R_iσ; or

2. g ≫ h; or

3. g = h and for some 1 ≤ j ≤ n we have t_1 = s_1, ..., t_{j-1} = s_{j-1} and
t_j ≻ s_j.
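For ground terms the comparison behind these conditions can be sketched as a short recursive function: compare weights first, then the top symbols, then the arguments from left to right. The encoding of ground terms as tuples `(symbol, arg, ...)`, the representation of a total precedence by integer ranks, and the concrete weights below are assumptions chosen for illustration.

```python
def weight(term, w):
    """|t|: the sum of the weights of all symbol occurrences in a ground term."""
    return w[term[0]] + sum(weight(arg, w) for arg in term[1:])


def kbo_greater(s, t, w, rank):
    """True if s is greater than t in the ground Knuth-Bendix order.
    rank maps each symbol to an integer; a higher rank means greater in >>."""
    ws, wt = weight(s, w), weight(t, w)
    if ws != wt:
        return ws > wt                      # heavier term wins
    if s[0] != t[0]:
        return rank[s[0]] > rank[t[0]]      # equal weight: compare top symbols
    for a, b in zip(s[1:], t[1:]):
        if a != b:
            return kbo_greater(a, b, w, rank)   # first differing argument decides
    return False                            # identical terms


# Hypothetical weights and precedence for the signature {g, a, b}.
w = {"g": 1, "a": 2, "b": 1}
rank = {"g": 2, "a": 1, "b": 0}
lhs = ("g", ("b",), ("a",), ("b",))   # g(b, a, b)
rhs = ("g", ("b",), ("b",), ("a",))   # g(b, b, a)
orients = kbo_greater(lhs, rhs, w, rank)
```

Here both terms have weight 5 and the same top symbol, so the decision is made at the second argument, where a is heavier than b.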



Lemma 18. Steps (M4)-(M8) are correct.

Proof. (M4) We know that l_i = g(s_1, ..., s_n) and r_i = h(t_1, ..., t_p). Take any
substitution σ minimal for M. Obviously, l_iσ = r_iσ is impossible, so ⟨l_i, L_i⟩σ ≻
⟨r_i, R_i⟩σ if and only if l_iσ ≻ r_iσ. By Lemma 17 this holds if and only if g ≫ h,
so step (M4) is correct.

(M5) We know that l_i = g(s_1, ..., s_n) and r_i = g(t_1, ..., t_n). Note that
due to PREPROCESS, l_i ≠ r_i, so n ≥ 1. It follows from Lemma 17 that
⟨l_i, L_i⟩σ ≻ ⟨r_i, R_i⟩σ if and only if ⟨s_1, ..., s_n, L_i⟩σ ≻ ⟨t_1, ..., t_n, R_i⟩σ, so step
(M5) is correct.

(M6) We know that l_i = x and r_i = y, where x, y are different variables. Note
that if L_i is empty, then the substitution Θ_c, where c is of the minimal weight,
is a counterexample to ⟨x, L_i⟩ > ⟨y, R_i⟩. So assume that L_i is nonempty and
consider two cases.

1. If there exist at least two terms s, t of the minimal weight, then there exists
a counterexample to ⟨x, L_i⟩ > ⟨y, R_i⟩. Indeed, if s ≻ t, then yσ ≻ xσ for
every σ such that σ(x) = t and σ(y) = s.

2. If there exists exactly one term t of the minimal weight, then xσ = yσ
for every σ minimal for M. Therefore, ⟨x, L_i⟩ > ⟨y, R_i⟩ is equivalent to
⟨L_i⟩ > ⟨R_i⟩.

In either case it is not hard to argue that step (M6) is correct.

(M7) We know that l_i = x and r_i = t. Let c be the least constant in the signature.
If t ≠ c, then Θ_c is obviously a counterexample to ⟨x, L_i⟩ > ⟨t, R_i⟩. Otherwise
t = c, then for every counterexample σ we have σ(x) = c. In either case it is not
hard to argue that step (M7) is correct.

(M8) We know that l_i = t and r_i = x. Note that t ≠ x due to the PREPROCESS
step, so if x occurs in t we have tσ ≻ xσ for all σ. Assume now that x does not
occur in t. Then x ∈ M. Consider two cases.

1. t is a nonconstant. For every substitution σ minimal for M we have |tσ| =
|xσ|, hence tσ is a nonconstant term of the minimal weight. This implies
that the signature contains a unary function symbol f of the weight 0. Take
any substitution σ. It is not hard to argue that σ_x^{f(tσ)} is a counterexample
to ⟨t, L_i⟩ > ⟨x, R_i⟩.

2. t is a constant c. Let d be the greatest constant in the signature among the
constants of the minimal weight. If d ≠ c, then Θ_d is obviously a
counterexample to ⟨c, L_i⟩ > ⟨x, R_i⟩. Otherwise d = c, then for every counterexample
σ we have σ(x) = c.

In either case it is not hard to argue that step (M8) is correct.

Let us now analyze the steps of TERMINATE. Note that for every constant c the
inequality w_c - w_e ≥ 0 belongs to D and for every function symbol g the inequality
w_g ≥ 0 belongs to D too.

Lemma 19. Steps (T1)-(T2) are correct.

Proof. (T1) Suppose d ∈ G, c ≠ d, and w_c - w_e ≥ 0 belongs to D^=. Then for
every solution to S we have w(c) = w(e), and therefore c is a constant of the
minimal weight. But since for every solution d is the greatest constant among
those having the minimal weight, we must have d ≫ c. The case c ∈ L is similar.

(T2) If f is a unary function symbol and w_f ≥ 0 belongs to D^=, then for every
solution w(f) = 0. By the definition of the Knuth-Bendix order we must have
f ≫ g for all g ∈ Σ - {f}. But then (i) there exists an infinite number of terms
of the minimal weight and (ii) a constant d ∈ G cannot be the greatest term of
the minimal weight (since for example f(d) ≻ d and |f(d)| = |d|).

Step (T3) makes a nondeterministic choice, which can replace one state by
several states S_1, ..., S_n. We say that such a step is correct if the set of solutions
to S is the union of the sets of solutions to S_1, ..., S_n.

Lemma 20. Steps (T3)-(T4) are correct.

Proof. (T3) Note that w is a solution to w_e - w_c ≥ 0 if and only if w(c) is the
minimal weight, so addition of w_e - w_c ≥ 0 to D amounts to stating that c has
the minimal weight. Evidently, for every solution, there must be a constant c of
the minimal weight, so the step is correct.

(T4) Suppose U = one, then for every solution there exists a unique term of the
minimal weight. If, for a constant c, w_c - w_e ≥ 0 belongs to D^=, then c must be
a term of the minimal weight. Therefore, there cannot be more than one such
constant c.

5.3 Extracting a Solution 

In this section we will show how to find a solution when the algorithm terminates 
with success. 

Lemma 21. Step (T5) is correct. 

Proof. To prove correctness of (T5) we have to show the existence of a solution.
In fact, we will show how to build a particular solution.

Note that when we terminate at step (T5), the system D is solvable, since it
was solvable initially and we performed consistency checks on every change of
D.

By Lemma 4 there exists an integer solution w to D which is also a solution
to the strict versions of every inequality in D - D^=. Likewise, there exists a linear
order ≫' extending ≫, since we performed consistency checks on every change
of ≫. We claim that (w, ≫') is a solution to (R, M, D, U, G, L, ≫). To this end
we have to show that w is a weight function, ≫' is compatible with w and all
items 1-5 of the definition of solution are satisfied.
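Finding a linear order that extends a consistent precedence is a topological sort. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later). The function name and the list-based output (symbols from greatest to least) are assumptions of this illustration, and the paper does not prescribe a particular method.

```python
from graphlib import TopologicalSorter


def linear_extension(symbols, pairs):
    """Return all symbols ordered from greatest to least so that g comes
    before h whenever (g, h), meaning g >> h, is in pairs.
    Raises graphlib.CycleError if the relation cannot be extended."""
    sorter = TopologicalSorter({s: set() for s in symbols})
    for g, h in pairs:
        sorter.add(h, g)       # g is a predecessor of h, so g is emitted first
    return list(sorter.static_order())


order = linear_extension(["g", "a", "b"], [("a", "b")])
```

Any order returned this way respects the given pairs. Which of the possible linear extensions is produced is not specified here.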

Let us first show that w is a weight function. Note that D contains all
inequalities w_g ≥ 0, where g ∈ Σ is a nonconstant, the inequality w_e > 0 and the
inequalities w_d - w_e ≥ 0 for all constants d ∈ Σ. So to show that w is a weight
function it remains to show that at most one unary function symbol f has weight
0. Indeed, if there were two such function symbols f_1 and f_2, then at step (T2)
we would add both f_1 ≫ f_2 and f_2 ≫ f_1, but the following consistency check
on ≫ would fail.

The proof that ≫' is compatible with w is similar.

Denote by ≻ the Knuth-Bendix order induced by (w, ≫').

1. The weight function w solves every inequality in D. This follows immediately
from our construction, if we show that w(e) is the minimal weight. Let us
show that w(e) is the minimal weight. Indeed, since D initially contains the
inequalities w_c - w_e ≥ 0 for all constants c, we have that w(e) is less than or
equal to the minimal weight. By step (T3), there exists a constant c such that
w_c - w_e ≥ 0 is in D^=, hence w(c) = w(e), and so w(e) is greater than or equal
to the minimal weight. Evidently, w(e) cannot be greater than the minimal
weight, since D contains the inequalities w_c - w_e ≥ 0 for all constants c.

2. If U = one, then there exists exactly one term of the minimal weight. Assume
U = one. We have to show that (i) there exists no unary function symbol
f of the minimal weight and (ii) there exists exactly one constant of the
minimal weight. Let f be a unary function symbol. By our construction,
w_f ≥ 0 belongs to D. By step (T2) w_f ≥ 0 does not belong to D^=, so by the
definition of w we have w(f) > 0. By our construction, w_c - w_e ≥ 0 belongs to
D for every constant c. By step (T4), at most one of such inequalities belongs
to D^=. Note that if w_c - w_e ≥ 0 does not belong to D^=, then w(c) - w(e) > 0
by the construction of w. Therefore, there exists at most one constant of the
minimal weight.

3. If d ∈ G (respectively d ∈ L) for some constant d, then d is the greatest
(respectively least) term among the terms of the minimal weight. We consider
the case d ∈ G, the case d ∈ L is similar. Note that by step (T2) there is
no unary function symbol f such that w_f ≥ 0 belongs to D^=, therefore
w(f) > 0 for all unary function symbols f. This implies that only constants
may have the minimal weight. But by step (T1) for all constants c of the
minimal weight we have d ≫ c, and hence also d ≻ c.

4. For every tuple inequality ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ in R and every substitution σ
minimal for M we have ⟨l_iσ, L_iσ⟩ ≻ ⟨r_iσ, R_iσ⟩. In the proof we will use the
fact that w(e) is the minimal weight.

By our construction (step M3), the inequality W(l_i, r_i) does not belong to
D^= (otherwise ⟨l_i, L_i⟩ > ⟨r_i, R_i⟩ would be removed at one of steps
(M4)-(M8)). In Lemma 16 we proved that |l_iσ| - |r_iσ| does not depend on σ,
whenever σ is minimal for M.

It follows from the definition of W that if W(l_i, r_i) ∈ D - D^=, then |l_iΘ_c| >
|r_iΘ_c|. Therefore, |l_iσ| > |r_iσ| for all substitutions σ minimal for M.

5. ≫' extends ≫. This follows immediately from our construction.

5.4 Time Complexity 

Provided that we use a polynomial-time algorithm for solving homogeneous
linear inequalities, and a polynomial-time algorithm for transitive closure, we can
prove the following lemma (for a proof see [10]).

Lemma 22. The algorithm runs in time polynomial in the size of the system of
rewrite rules.



5.5 A Simple Example 

Let us consider how the algorithm works on the rewrite rule g(x, a, b) → g(b, b, a)
from the earlier example. Initially, R consists of one tuple inequality

    ⟨g(x, a, b)⟩ > ⟨g(b, b, a)⟩                                      (4)

and D consists of the following linear inequalities:

    w_g ≥ 0, w_e > 0, w_a - w_e ≥ 0, w_b - w_e ≥ 0.

At step (M1) we note that n(x, g(x, a, b)) = 1 > n(x, g(b, b, a)) = 0. Therefore,
we add x to M.

At step (M3) we add the linear inequality w_e - w_b ≥ 0 to D obtaining

    w_g ≥ 0, w_e > 0, w_a - w_e ≥ 0, w_b - w_e ≥ 0, w_e - w_b ≥ 0.

Now we compute D^=. It consists of the two inequalities w_b - w_e ≥ 0 and
w_e - w_b ≥ 0, so we have to apply one of the steps (M4)-(M8); in this case the
applicable step is (M5). We replace (4) by

    ⟨x, a, b⟩ > ⟨b, b, a⟩.                                           (5)

At the next iteration of step (M3) we should add to D the linear inequality
w_e - w_b ≥ 0, but this linear inequality is already a member of D, and moreover
a member of D^=. So we proceed to step (M7). At this step we set L = {b} and
replace (5) by

    ⟨a, b⟩ > ⟨b, a⟩.                                                 (6)

Then at step (M3) we add w_a - w_b ≥ 0 to D obtaining

    w_g ≥ 0, w_e > 0, w_a - w_e ≥ 0, w_b - w_e ≥ 0, w_e - w_b ≥ 0, w_a - w_b ≥ 0.

Now w_a - w_b ≥ 0 does not belong to the degenerate subsystem of D, so we proceed
to TERMINATE. Steps (T1)-(T4) change neither D nor ≫, so we terminate with
success.

Solutions extracted according to Lemma 21 will be any pairs (w, ≫) such
that w(a) > w(b). Note that these are not all solutions. There are also solutions
such that w(a) = w(b) and a ≫ b. However, if we try to find all solutions we
can no longer guarantee that the algorithm runs in polynomial time.



6 Main Results 

Lemmas 9-21 guarantee that our algorithm is correct. Lemma 22 implies that the
algorithm runs in polynomial time. Hence we obtain the following theorem.

Theorem 1. The problem of the existence of a KBO which orients a given
rewrite rule system can be solved in polynomial time.

In [10], using a similar technique, we prove the following theorem.

Theorem 2. The problem of solving a Knuth-Bendix constraint consisting of a
single inequality can be solved in polynomial time.

The real-valued Knuth-Bendix order is defined in the same way as
above, except that the range of the weight function is the set of nonnegative real
numbers. The real-valued KBO was introduced in [13]. Note that in view of the
results of Section 3 on systems of homogeneous linear inequalities
the algorithm is also sound and complete for the real-valued orders. Therefore,
we have

Theorem 3. If a rewrite rule system is orientable using a real-valued KBO, 
then it is also orientable using an integer-valued KBO. 

Acknowledgments. The authors are partially supported by grants from EPSRC.






References 

1. F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University
Press, Cambridge, 1998.

2. H. Comon. Solving symbolic ordering constraints. International Journal of Foun- 
dations of Computer Science, 1(4):387-411, 1990. 

3. H. Comon and R. Treinen. Ordering constraints on trees. In S. Tison, editor, Trees
in Algebra and Programming: CAAP’94, volume 787 of Lecture Notes in Computer 
Science, pages 1-14. Springer Verlag, 1994. 

4. N. Dershowitz and D.A. Plaisted. Rewriting. In A. Robinson and A. Voronkov,
editors, Handbook of Automated Reasoning, volume I, chapter 9, pages 533-608.
Elsevier Science, 2001. 

5. D. Detlefs and R. Forgaard. A procedure for automatically proving the termination
of a set of rewrite rules. In J.-P. Jouannaud, editor, Rewriting Techniques and
Applications, First International Conference, RTA-85, volume 202 of Lecture Notes 
in Computer Science, pages 255-270, Dijon, France, 1985. Springer Verlag. 

6. J. Dick, J. Kalmus, and U. Martin. Automating the Knuth-Bendix ordering. Acta 
Informatica, 28(2):95-119, 1990. 

7. D. Knuth and P. Bendix. Simple word problems in universal algebras. In J. Leech, 
editor, Computational Problems in Abstract Algebra, pages 263-297. Pergamon 
Press, Oxford, 1970. 

8. K. Korovin and A. Voronkov. A decision procedure for the existential theory of
term algebras with the Knuth-Bendix ordering. In Proc. 15th Annual IEEE Symp.
on Logic in Computer Science, pages 291-302, Santa Barbara, California, June
2000.

9. K. Korovin and A. Voronkov. Knuth-Bendix constraint solving is NP-complete. 
Preprint CSPP-8, Department of Computer Science, University of Manchester, 
November 2000. 

10. K. Korovin and A. Voronkov. Verifying orientability of rewrite rules using the
Knuth-Bendix order. Preprint CSPP-11, Department of Computer Science,
University of Manchester, March 2001.

11. M.S. Krishnamoorthy and P. Narendran. On recursive path ordering. Theoretical 
Computer Science, 40:323-328, 1985. 

12. P. Lescanne. Term rewriting systems and algebra. In R.E. Shostak, editor, 7th 
International Conference on Automated Deduction, CADE-7, volume 170 of Lecture 
Notes in Computer Science, pages 166-174, 1984. 

13. U. Martin. How to choose weights in the Knuth-Bendix ordering. In Rewriting 
Techniques and Applications, volume 256 of Lecture Notes in Computer Science, 
pages 42-53, 1987. 

14. P. Narendran, M. Rusinowitch, and R. Verma. RPO constraint solving is in NP. In 
G. Gottlob, E. Grandjean, and K. Seyr, editors, Computer Science Logic, 12th In- 
ternational Workshop, CSL’98, volume 1584 of Lecture Notes in Computer Science, 
pages 385-398. Springer Verlag, 1999. 

15. R. Nieuwenhuis. Simple LPO constraint solving methods. Information Processing 
Letters, 47:65-69, 1993. 

16. A. Schrijver. Theory of Linear and Integer Programming. John Wiley and Sons, 
1998. 




Relating Accumulative and Non-accumulative 
Functional Programs 



Armin Kühnemann¹, Robert Glück², and Kazuhiko Kakehi² 

¹ Institute for Theoretical Computer Science, Department of Computer Science, 
Dresden University of Technology, D-01062 Dresden, Germany 
kuehne@orchid.inf.tu-dresden.de 
² Institute for Software Production Technology, 
Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan 
glueck@acm.org and kaz@futamura.info.waseda.ac.jp 



Abstract. We study the problem of transforming functional programs 
that make intensive use of append functions (like inefficient list reversal) 
into programs that use accumulating parameters instead (like efficient list 
reversal). We give an (automatic) transformation algorithm for this 
problem and identify a class of functional programs, namely restricted 
2-modular tree transducers, to which it can be applied. Moreover, since we 
obtain macro tree transducers as transformation results and since we also 
give the inverse transformation algorithm, we obtain a new characterization 
of the class of functions induced by macro tree transducers. 

1 Introduction 

Functional programming languages are very well suited to specify programs in 
a modular style, which simplifies the design and the verification of programs. 
Unfortunately, modular programs often have poor time and space complexity 
compared to other (sometimes less understandable) programs that solve the 
same tasks. As a running example we consider the following program $p_{non}$, 
which contains the functions app and rev, which append and reverse lists, 
respectively. For simplicity we only consider lists with elements A and B, where 
lists are represented by monadic trees. In particular, the empty list is 
represented by the symbol N. The program $p_{non}$ is a straightforward (but 
naive) solution for list reversal, since the definition of rev simply uses the 
function app. 

rev (A x1) = app (rev x1) (A N)      app (A x1) y1 = A (app x1 y1) 
rev (B x1) = app (rev x1) (B N)      app (B x1) y1 = B (app x1 y1) 
rev N      = N                       app N y1      = y1 

For list reversal, the program $p_{non}$ has quadratic time complexity in the 
length of an input list l, since it produces intermediate results, where the 
number and the maximum length of the intermediate results depend on the length 
of l. Therefore we would prefer the following program $p_{acc}$ with linear time 
complexity in the length of an input list, which uses a binary auxiliary 
function rev'. 

rev (A x1) = rev' x1 (A N)      rev' (A x1) y1 = rev' x1 (A y1) 
rev (B x1) = rev' x1 (B N)      rev' (B x1) y1 = rev' x1 (B y1) 
rev N      = N                  rev' N y1      = y1 
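
The difference between the two programs can be made concrete with a small executable sketch. The Python encoding below is an assumption made purely for illustration: monadic trees are nested tuples, and the names `rev_non`, `rev_acc`, `rev_acc_aux`, and `from_string` are hypothetical stand-ins for rev of $p_{non}$, rev of $p_{acc}$, rev', and a test helper, respectively.

```python
# Monadic trees: ("A", t), ("B", t), or ("N",) for the empty list.
N = ("N",)


def app(t, y):
    """app of p_non: replaces the final N of t by y."""
    if t == N:
        return y
    c, x1 = t
    return (c, app(x1, y))


def rev_non(t):
    """rev of p_non: one call of app per list element (quadratic overall)."""
    if t == N:
        return N
    c, x1 = t
    return app(rev_non(x1), (c, N))


def rev_acc_aux(t, y):
    """rev' of p_acc: y accumulates the elements that were already visited."""
    if t == N:
        return y
    c, x1 = t
    return rev_acc_aux(x1, (c, y))


def rev_acc(t):
    """rev of p_acc: starts the accumulation with a one-element list."""
    if t == N:
        return N
    c, x1 = t
    return rev_acc_aux(x1, (c, N))


def from_string(s):
    """Hypothetical helper: the monadic tree for a word over {A, B}."""
    t = N
    for ch in reversed(s):
        t = (ch, t)
    return t
```

Both functions compute the same reversal; `rev_non` rebuilds an intermediate list in every recursion step, whereas `rev_acc` visits every element exactly once.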

Since $p_{acc}$ reverses lists by accumulating their elements in the second 
argument of rev', we call $p_{acc}$ an accumulative program, whereas we call 
$p_{non}$ non-accumulative. It has already been shown, in the context of 
transforming programs into iterative form, how non-accumulative programs can be 
transformed (non-automatically) into their more efficient accumulative versions. 
An algorithm which removes append functions in many cases, e.g. also in 
$p_{non}$, was presented in [23]. In comparison to [23], our transformation 
technique is more general in three aspects (though so far we only have 
artificial examples to demonstrate these generalizations): we consider 
(i) arbitrary tree structures (instead of lists), (ii) functions defined by 
simultaneous recursion, and (iii) substitution functions on trees (instead of 
append) which may replace different designated symbols by different trees. On 
the other hand, our technique is restricted to unary functions (apart from 
substitutions), though also in [23] the only example program involving a 
non-unary function could not be optimized. Moreover, since we formally describe 
the two different program paradigms and since we also present a transformation 
of accumulative into non-accumulative programs, we obtain the equality of the 
classes of functions computed by the two paradigms. 

Well-known techniques for eliminating intermediate results cannot improve 
$p_{non}$: (i) deforestation and supercompilation suffer from the phenomenon of 
the obstructing function call [2], and (ii) shortcut deforestation is hampered 
by the unknown number of intermediate results. An extension of shortcut 
deforestation has been developed which is based on type inference and splits 
function definitions into workers and wrappers. It successfully transforms 
$p_{non}$ into $p_{acc}$, but is also less general in the above-mentioned three 
aspects with respect to the transformation of non-accumulative into accumulative 
programs. 

It has been demonstrated that composition and decomposition techniques for 
attribute grammars and tree transducers can sometimes help when deforestation 
fails. For this purpose we have considered special functional programs as 
compositions of macro tree transducers (for short mtts). Every function f of an 
mtt is defined by a case analysis on the root symbol c of its first argument t. 
The right-hand side of the equation for f and c may only contain (extended) 
primitive-recursive function calls, i.e. the first argument of a function call 
has to be a variable that refers to a subtree of t. Under certain restrictions, 
compositions of mtts can be transformed into a single mtt. The function app in 
$p_{non}$ is an mtt, whereas the function rev in $p_{non}$ is not an mtt (since 
it calls app with a first argument that differs from $x_1$), such that these 
techniques cannot be applied directly. 

In this paper we consider $p_{non}$ as a 2-modular tree transducer (for short 
2-modtt), where a function in module 1 (here rev) is allowed to call a function 
in module 2 (here app) non-primitive-recursively. Additionally, the two modules 
of $p_{non}$ fulfill sufficient conditions (rev and app are so-called top-down 
tree transducer- and substitution-modules, respectively), such that we can apply 
a decomposition step and two subsequent composition steps to transform 
$p_{non}$ into the (more efficient) mtt $p_{acc}$. Since these constructions 
(called accumulation) transform every 2-modtt that fulfills our restrictions 
into an mtt, and since we also present inverse constructions (called 
deaccumulation) to transform mtts into the same class of restricted 2-modtts, 
we get a nice characterization of mtts in terms of restricted 2-modtts. 

Besides this introduction, the paper contains five further sections. In Section 
2 we fix elementary notions and notations. Section 3 introduces our functional 
language and tree transducers. Section 4 and Section 5 present accumulation and 
deaccumulation, respectively. Finally, Section 6 contains future research topics. 

2 Preliminaries 

We denote the set of natural numbers including 0 by $\mathbb{N}$ and the set 
$\mathbb{N} - \{0\}$ by $\mathbb{N}_+$. For every $m \in \mathbb{N}$, the set 
$\{1, \ldots, m\}$ is denoted by $[m]$. The cardinality of a set $K$ is denoted 
by $card(K)$. We will use the sets $X = \{x_1, x_2, x_3, \ldots\}$, 
$Y = \{y_1, y_2, y_3, \ldots\}$, and $Z = \{z\}$ of variables. For every 
$n \in \mathbb{N}$, let $X_n = \{x_1, \ldots, x_n\}$ and 
$Y_n = \{y_1, \ldots, y_n\}$. In particular, $X_0 = Y_0 = \emptyset$. 

Let $\Rightarrow$ be a binary relation on a set $K$. Then, $\Rightarrow^*$ 
denotes the transitive, reflexive closure of $\Rightarrow$. If 
$k \Rightarrow^* k'$ for $k, k' \in K$ and if there is no $k'' \in K$ such that 
$k' \Rightarrow k''$, then $k'$ is called a normal form of $k$ with respect to 
$\Rightarrow$, which is denoted by $nf(\Rightarrow, k)$, if it exists and if it 
is unique. 

A ranked alphabet is a pair $(S, rank)$ where $S$ is a finite set and $rank$ is 
a mapping which associates with every symbol $s \in S$ a natural number called 
the rank of $s$. We simply write $S$ instead of $(S, rank)$ and assume $rank$ as 
implicitly given. The set of elements of $S$ with rank $n$ is denoted by 
$S^{(n)}$. The set of trees over $S$, denoted by $T_S$, is the smallest subset 
$T \subseteq (S \cup \{(, )\})^*$ such that $S^{(0)} \subseteq T$ and for every 
$s \in S^{(n)}$ with $n \in \mathbb{N}_+$ and $t_1, \ldots, t_n \in T$: 
$(s\ t_1 \ldots t_n) \in T$. 
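
The definition of trees over a ranked alphabet can be read as a simple membership test. The following sketch assumes a tuple encoding `(s, t1, ..., tn)` for the tree $(s\ t_1 \ldots t_n)$; the dictionary `RANK` is a hypothetical ranked alphabet chosen to match the constructors of the running example and is not fixed by the text at this point.

```python
# Hypothetical ranked alphabet: symbol -> rank.
RANK = {"N": 0, "A": 1, "B": 1, "APP": 2}


def is_tree(t, rank=RANK):
    """Return True iff t encodes a tree over the ranked alphabet `rank`.

    A tree (s t1 ... tn) is encoded as the tuple (s, t1, ..., tn), so a
    symbol of rank 0 is the one-element tuple (s,).
    """
    if not isinstance(t, tuple) or len(t) == 0:
        return False
    symbol, children = t[0], t[1:]
    # The number of subtrees has to agree with the rank of the root symbol.
    if not isinstance(symbol, str) or rank.get(symbol) != len(children):
        return False
    return all(is_tree(child, rank) for child in children)
```

For instance, `("A", ("B", ("N",)))` is accepted, whereas `("A",)` is rejected because A has rank 1 but no subtree.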

3 Language 

We consider a simple first-order, constructor-based functional programming lan- 
guage P as source and target language for our transformations. Every program 
p G P consists of several modules and every module consists of several function 
definitions. The functions of a module are defined by a complete case analysis 
on the first argument (recursion argument) via pattern matching, where only 
flat patterns are allowed. The other arguments are called context arguments. If 
in the right-hand side of a function definition there is a call of a function that is 
defined in the same module, then the first argument of this function call has to 
be a subtree of the first argument in the corresponding left-hand side. 

For simplicity we choose a unique ranked alphabet $C_p$ of constructors, which 
is used to build up input trees and output trees of every function in $p$. In 
example programs and program transformations we relax the completeness of 
function definitions on $T_{C_p}$ by leaving out those equations which are not 
intended to be used in evaluations. Sometimes this leads to small technical 
difficulties, but it avoids the overhead of introducing types. 

Definition 1 Let $C$ and $F$ be ranked alphabets of constructors and function 
symbols (for short functions), respectively, such that $F^{(0)} = \emptyset$ and 
$X$, $Y$, $C$, and $F$ are pairwise disjoint. We define the classes $P$, $M$, 
$D$, and $R$ of programs, modules, function definitions, and right-hand sides, 
respectively, by the following grammar and the subsequent restrictions. We 
assume that $p$, $m$, $d$, $r$, $c$, and $f$ (also equipped with indices) range 
over the sets $P$, $M$, $D$, $R$, $C$, and $F$, respectively. 

\[
\begin{array}{rcll}
p & ::= & m_1 \ldots m_l & \text{(program)} \\
m & ::= & d_1 \ldots d_h & \text{(module)} \\
d & ::= & f\ (c_1\ x_1 \ldots x_{k_1})\ y_1 \ldots y_n = r_1 & \text{(function definition)} \\
  &     & \quad\vdots & \\
  &     & f\ (c_q\ x_1 \ldots x_{k_q})\ y_1 \ldots y_n = r_q & \\
r & ::= & x_i \mid y_j \mid c\ r_1 \ldots r_k \mid f\ r_0\ r_1 \ldots r_n & \text{(right-hand side)}
\end{array}
\]

The sets of constructors, functions, and modules that occur in $p \in P$ are 
denoted by $C_p$, $F_p$, and $M_p$, respectively. The set of functions that is 
defined in $m \in M_p$ is denoted by $F_m$. For every $i, j \in [l]$ with 
$i \neq j$: $F_{m_i} \cap F_{m_j} = \emptyset$. For every $i \in [l]$ and 
$f \in F_{m_i}$, the module $m_i$ contains exactly one function definition for 
$f$. For every $i \in [l]$, $f \in F_{m_i}^{(n+1)}$, and $c \in C_p^{(k)}$, there 
is exactly one equation of the form 

\[ f\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_n = rhs(f, c) \]

with $rhs(f, c) \in RHS(F_{m_i}, C_p \cup (F_p - F_{m_i}), X_k, Y_n)$, where for 
every $F' \subseteq F$, $C' \subseteq C \cup F$, and $k, n \in \mathbb{N}$, 
$RHS(F', C', X_k, Y_n)$ is the smallest set $RHS$ with: 

- For every $f \in F'^{(a+1)}$, $i \in [k]$, and $r_1, \ldots, r_a \in RHS$: 
  $(f\ x_i\ r_1 \ldots r_a) \in RHS$. 

- For every $c \in C'^{(a)}$ and $r_1, \ldots, r_a \in RHS$: 
  $(c\ r_1 \ldots r_a) \in RHS$. 

- For every $j \in [n]$: $y_j \in RHS$. □ 



Example 2 

- $p_{non} \in P$ where $M_{p_{non}}$ contains two modules $m_{non,rev}$ and 
  $m_{non,app}$ containing the definitions of rev and app, respectively. 

- $p_{acc} \in P$ where $M_{p_{acc}}$ contains one module $m_{acc,rev}$ 
  containing the definitions of rev and rev'. 

- Let $p_{fre}$ be the program 

rev (A x1) = APP (rev x1) (A N) 
rev (B x1) = APP (rev x1) (B N) 
rev N      = N 

int (APP x1 x2) = app (int x1) (int x2) 
int (A x1)      = A (int x1) 
int (B x1)      = B (int x1) 
int N           = N 

app (A x1) y1 = A (app x1 y1) 
app (B x1) y1 = B (app x1 y1) 
app N y1      = y1 

  Then, $p_{fre} \in P$ where $M_{p_{fre}}$ contains three modules 
  $m_{fre,rev}$, $m_{fre,app}$, and $m_{fre,int}$ containing the definitions of 
  rev, app, and int, respectively. □ 

Now we will introduce a modular tree transducer as a hierarchy 
$(m_1, \ldots, m_u)$ of modules, where the functions in a module $m_j$ may only 
call functions defined in the modules $m_j, \ldots, m_u$. Moreover, we will 
define interpretation-modules, which interpret designated constructors as 
function calls, and substitution-modules, which substitute designated 
constructors by context arguments. 



Definition 3 Let $p \in P$. 

- A sequence $(m_1, \ldots, m_u)$ with $u \in \mathbb{N}_+$ and 
  $m_1, \ldots, m_u \in M_p$ is called u-modular tree transducer (for short 
  u-modtt), iff $F_{m_1}^{(1)} \neq \emptyset$ and for every $i \in [u]$, 
  $f \in F_{m_i}^{(n+1)}$, and $c \in C_p^{(k)}$: 
  $rhs(f, c) \in RHS(F_{m_i}, C_p \cup \bigcup_{i+1 \le j \le u} F_{m_j}, X_k, Y_n)$. 
  We call $F_{m_i}$ and $\bigcup_{i+1 \le j \le u} F_{m_j}$ the set of internal 
  and external functions of $m_i$, respectively.¹ 

- A 1-modtt $(m_1)$, abbreviated by $m_1$, is called macro tree transducer (for 
  short mtt). 

- An mtt $m$ with $F_m = F_m^{(1)}$ is called top-down tree transducer (for 
  short tdtt). 

- A module $m \in M_p$ is also called mtt-module.² 

- An mtt-module $m \in M_p$ with $F_m = F_m^{(1)}$ is called tdtt-module. 

- A tdtt-module $m \in M_p$ is called interpretation-module (for short 
  int-module), iff $card(F_m) = 1$, and there is $C_m \subseteq C_p$ such that 
  $m$ contains for $int \in F_m$ and for every $k \in \mathbb{N}$, 
  $c \in C_m^{(k)}$ and for some $f_c \in (F_p - F_m)^{(k)}$ the equation 

  \[ int\ (c\ x_1 \ldots x_k) = f_c\ (int\ x_1) \ldots (int\ x_k) \]

  and for $int \in F_m$ and for every $k \in \mathbb{N}$ and 
  $c \in (C_p - C_m)^{(k)}$ the equation 

  \[ int\ (c\ x_1 \ldots x_k) = c\ (int\ x_1) \ldots (int\ x_k). \]

- An mtt-module $m \in M_p$ is called substitution-module (for short 
  sub-module), iff there are $n \in \mathbb{N}$ and distinct 
  $\pi_1, \ldots, \pi_n \in C_p^{(0)}$ such that $card(F_m^{(n+1)}) = 1$, 
  $card(F_m^{(i+1)}) = 0$ for every $i > n$, $card(F_m^{(i+1)}) \le 1$ for every 
  $i < n$, and $m$ contains for every $i \in \mathbb{N}$, 
  $sub_i \in F_m^{(i+1)}$, and $j \in [i]$ the equation 

  \[ sub_i\ \pi_j\ y_1 \ldots y_i = y_j \]

  and for every $i \in \mathbb{N}$, $sub_i \in F_m^{(i+1)}$, 
  $k \in \mathbb{N}$, and $c \in (C_p - \{\pi_1, \ldots, \pi_i\})^{(k)}$ the 
  equation 

  \[ sub_i\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_i = c\ (sub_i\ x_1\ y_1 \ldots y_i) \ldots (sub_i\ x_k\ y_1 \ldots y_i). \] □ 

¹ Our definition of modtts differs slightly from the usual one, since it allows 
a variable $x_i$ in a right-hand side only as first argument of an internal 
function. Arbitrary occurrences of $x_i$ can be achieved by applying an identity 
function to $x_i$, which is an additional internal function. Further note that 
the assumption $F_{m_1}^{(1)} \neq \emptyset$ only simplifies our presentation, 
but could be avoided. 

² A module $m$ is not necessarily an mtt, since it may call external functions. 

Example 4 

- $(m_{non,rev}, m_{non,app})$ is a 2-modtt, $m_{non,rev}$ is a tdtt-module, and 
  $m_{non,app}$ is a sub-module (where $n = 1$, $\pi_1 = N$, 
  $F_{m_{non,app}}^{(2)} = \{app\}$, and $F_{m_{non,app}}^{(1)} = \emptyset$). 

- $m_{acc,rev}$ is an mtt-module and a 1-modtt, thus also an mtt. 

- $m_{fre,rev}$ is a tdtt, $(m_{fre,int}, m_{fre,app})$ is a 2-modtt, 
  $m_{fre,int}$ is an int-module (where $C_{m_{fre,int}} = \{APP\}$ and 
  $f_{APP} = app$), and $m_{fre,app}$ is a sub-module. □ 

We fix call-by-name semantics, i.e. for every program $p \in P$ we use a 
call-by-name reduction relation $\Rightarrow_p$ on $T_{C_p \cup F_p}$. It can be 
proved that for every program $p \in P$, u-modtt $(m_1, \ldots, m_u)$ with 
$m_1, \ldots, m_u \in M_p$, $f \in F_{m_1}^{(1)}$, and $t \in T_{C_p}$ the normal 
form $nf(\Rightarrow_p, (f\ t))$ exists. The proof is based on the result that 
for every modtt the corresponding (nondeterministic) reduction relation is 
terminating (and confluent). This result can also be extended to normal forms of 
expressions of the form $(f_n\ (f_{n-1} \ldots (f_2\ (f_1\ t)) \ldots))$, where 
every $f_i$ is a unary function of the first module of a modtt in $p$. 

In the framework of this paper we would like to optimize the evaluation of 
expressions of the form $(f\ t)$, where $t$ is a tree over constructors. Since 
the particular constructor trees are not relevant for the transformations, we 
abstract them by a variable $z$, i.e. we handle expressions of the form 
$(f\ z)$. The transformations will also deliver expressions of the form 
$(f_2\ (f_1\ z))$. All these expressions are initial expressions for programs, 
which are defined as follows. 

Definition 5 Let $p \in P$ and let $f$ range over $\{f \mid$ there is a modtt 
$(m_1, \ldots, m_u)$ with $m_1, \ldots, m_u \in M_p$ such that 
$f \in F_{m_1}^{(1)}\}$. The set of initial expressions for $p$, denoted by 
$E_p$, is defined as follows, where $e$ ranges over $E_p$: 

\[ e ::= f\ e \mid z \qquad \text{(initial expression for a program)} \] □ 

Example 6 

- z, (rev z), and (rev (rev z)) are initial expressions for $p_{non}$ and for 
  $p_{acc}$. 

- z, (rev z), (int z), and (int (rev z)) are initial expressions for 
  $p_{fre}$. □ 

Definition 7 Let $p \in P$ and $e \in E_p$. The function 
$\tau_{p,e} : T_{C_p} \to T_{C_p}$, defined by 
$\tau_{p,e}(t) = nf(\Rightarrow_p, e[z/t])$ for every $t \in T_{C_p}$, where 
$e[z/t]$ denotes the substitution of $z$ in $e$ by $t$, is called the tree 
transformation induced by $p$ and $e$. The class of tree transformations induced 
by all programs $p \in P$ and initial expressions 
$(f_n\ (f_{n-1} \ldots (f_2\ (f_1\ z)) \ldots)) \in E_p$ with $n \ge 1$, such 
that $p = m_1 \ldots m_l$ with $m_1, \ldots, m_l \in M_p$, and for every 
$i \in [n]$ there is $j_i \in [l]$ with $f_i \in F_{m_{j_i}}^{(1)}$, is denoted 
by 

\[ M_{j_1}\ ;\ M_{j_2}\ ;\ \ldots\ ;\ M_{j_n}, \]

where $M_{j_i}$ is 

- $T$, if $m_{j_i}$ is a tdtt, 

- $MT$, if $m_{j_i}$ is an mtt, and 

- $u\text{-}ModT(M'_1, \ldots, M'_u)$, if there is a u-modtt 
  $(m'_1, \ldots, m'_u)$ with $m'_1, \ldots, m'_u \in M_p$ and 
  $m_{j_i} = m'_1$, where $M'_j$ is 

  • $T$, if $m'_j$ is a tdtt-module, 

  • $INT$, if $m'_j$ is an int-module, and 

  • $SUB$, if $m'_j$ is a sub-module. □ 

Example 8 

- $\tau_{p_{acc},(rev\ z)} \in MT$ 

- $\tau_{p_{acc},(rev\ (rev\ z))} \in MT\ ;\ MT$ 

- $\tau_{p_{non},(rev\ z)} \in 2\text{-}ModT(T, SUB)$ 

- $\tau_{p_{fre},(int\ (rev\ z))} \in T\ ;\ 2\text{-}ModT(INT, SUB)$ □ 

Our program transformations should preserve the semantics, i.e. they should 
transform pairs in $\{(p, e) \mid p \in P, e \in E_p\}$ into equivalent pairs. 
Since some transformations will introduce new constructors, the following 
definition of equivalence can later be instantiated such that it is restricted 
to input trees over "old" constructors. 

Definition 9 For every $i \in [2]$ let 
$(p_i, e_i) \in \{(p, e) \mid p \in P, e \in E_p\}$. Let 
$C' \subseteq C_{p_1} \cap C_{p_2}$. The pairs $(p_1, e_1)$ and $(p_2, e_2)$ are 
called equivalent with respect to $C'$, denoted by 
$(p_1, e_1) \equiv_{C'} (p_2, e_2)$, if for every $t \in T_{C'}$: 
$\tau_{p_1,e_1}(t) = \tau_{p_2,e_2}(t)$. 

4 Accumulation 

We would like to give an algorithm based on results from the theory of tree 
transducers, which translates non-accumulative programs of the kind "inefficient 
reverse" into accumulative programs of the kind "efficient reverse". According 
to our definitions in the previous section this means composing the two modules 
of a 2-modtt into an mtt. A result in [7] shows that in general this is not 
possible, since there are 2-modtts whose induced tree transformations cannot 
even be realized by any finite composition of mtts. Fortunately, there are 
sufficient conditions for the two modules under which a composition into an mtt 
is possible, namely, if they are a tdtt-module and a sub-module, respectively. 

Theorem 10 $2\text{-}ModT(T, SUB) \subseteq MT$. 

Proof. Follows from Lemma 11, Corollary 15, and Lemma 17. □ 

4.1 Freezing 

Surprisingly, in the first step we will decompose modtts. We use a technique, 
which is based on Lemma 5.1 of 0: The first module of a modtt can be sepa- 
rated from the rest of the modtt by freezing the occurrences of external functions 
in right-hand sides, i.e. substituting them by new constructors. These new con- 
structors are later activated by an interpretation function, which interprets every 
new constructor $c$ as a call of that function $f_c$ which corresponds to $c$ by freezing. 

Lemma 11 $2\text{-}ModT(T, SUB) \subseteq T\ ;\ 2\text{-}ModT(INT, SUB)$. 

Proof. Let $p \in P$ and $e \in E_p$, such that $m_1, m_2 \in M_p$, 
$(m_1, m_2)$ is a 2-modtt, $m_1$ is a tdtt-module, $m_2$ is a sub-module, and 
$e = (f\ z)$ for some $f \in F_{m_1}$. We construct $p' \in P$ from $p$ by 
changing $m_1$ and by adding the int-module $m_3$: 

1. For every $g \in F_{m_1}$ and $c \in C_p^{(k)}$ the equation 
   $g\ (c\ x_1 \ldots x_k) = rhs(g, c)$ of $m_1$ is replaced by 
   $g\ (c\ x_1 \ldots x_k) = freeze(rhs(g, c))$, where $freeze(rhs(g, c))$ is 
   constructed from $rhs(g, c)$ by replacing every occurrence of 
   $sub_i \in F_{m_2}^{(i+1)}$ by a new constructor 
   $SUB_i \in (C - C_p)^{(i+1)}$. Thus, $m_1$ becomes a tdtt. 

2. $m_3$ contains for a new function $int \in (F - F_p)^{(1)}$ and for every 
   $c \in C_p^{(k)}$ the equation 

   \[ int\ (c\ x_1 \ldots x_k) = c\ (int\ x_1) \ldots (int\ x_k) \]

   and for every $sub_i \in F_{m_2}^{(i+1)}$ the equation 

   \[ int\ (SUB_i\ x_1 \ldots x_{i+1}) = sub_i\ (int\ x_1) \ldots (int\ x_{i+1}). \]

   Thus, $(m_3, m_2)$ is a 2-modtt. 

Let $e' = (int\ (f\ z))$, i.e. $e' \in E_{p'}$. Then, 
$(p, e) \equiv_{C_p} (p', e')$. □ 

Example 12 Freezing translates $p_{non}$ with initial expression (rev z) into 
$p_{fre}$ with initial expression (int (rev z)). □ 
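
The effect of freezing on the running example can be replayed with a small sketch. It assumes the same hypothetical tuple encoding of trees as before; `rev_fre` and `interp` are illustrative names for rev and int of $p_{fre}$, and `rev_non` is included only to compare against the original program.

```python
N = ("N",)


def app(t, y):
    """app of the sub-module: replaces the final N of t by y."""
    if t == N:
        return y
    c, x1 = t
    return (c, app(x1, y))


def rev_non(t):
    """rev of p_non, kept here only as a reference for comparison."""
    if t == N:
        return N
    c, x1 = t
    return app(rev_non(x1), (c, N))


def rev_fre(t):
    """rev of p_fre: the call of app is frozen into the constructor APP."""
    if t == N:
        return N
    c, x1 = t
    return ("APP", rev_fre(x1), (c, N))


def interp(t):
    """int of p_fre: interprets APP as a call of app, copies A, B and N."""
    if t == N:
        return N
    if t[0] == "APP":
        return app(interp(t[1]), interp(t[2]))
    c, x1 = t
    return (c, interp(x1))


def from_string(s):
    """Hypothetical helper: the monadic tree for a word over {A, B}."""
    t = N
    for ch in reversed(s):
        t = (ch, t)
    return t
```

On every input list, `interp(rev_fre(t))` yields the same tree as `rev_non(t)`; the intermediate result of `rev_fre` is a binary tree of APP nodes that records which appends still have to be carried out.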

4.2 Integration 

Now, having 2-modtts with an int- and a sub-module, we can compose the two 
modules into an mtt. The main idea is to integrate the behaviour of the 
interpretation into the substitution. This is best explained by means of our 
example: The equation int (APP x1 x2) = app (int x1) (int x2) is replaced by 
the new equation app (APP x1 x2) y1 = app x1 (app x2 y1), which interprets APP 
as the function app and sends app into the subtrees of APP. Note that the new 
equation represents the associativity of app, if we interpret APP also on the 
left-hand side as app. The associativity of app, which is the function of a 
special sub-module (cf. Example 4), already plays an important role in earlier 
work, and, for functions of general sub-modules, it will be proved as the basis 
for our integration technique. 

Lemma 13 $2\text{-}ModT(INT, SUB) \subseteq MT$. 

Proof. Let $p \in P$ and $e \in E_p$, such that $m_1, m_2 \in M_p$, 
$(m_1, m_2)$ is a 2-modtt, $m_1$ is an int-module, $m_2$ is a sub-module, and 
$e = (int\ z)$ with $F_{m_1} = \{int\}$. We construct $p' \in P$ by replacing 
$m_1$ and $m_2$ in $p$ by the following mtt $m$: 

1. Every equation $sub_i\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_i = rhs(sub_i, c)$ 
   with $sub_i \in F_{m_2}^{(i+1)}$ and $c \in (C_p - C_{m_1})^{(k)}$ (cf. 
   Def. 3 for $C_{m_1}$) of $m_2$ is taken over to $m$, where every occurrence 
   of $sub_i \in F_{m_2}^{(i+1)}$ is changed into $sub'_i \in F_m^{(i+1)}$. 

2. If $F_{m_2}^{(1)} = \emptyset$, then for a new function 
   $sub'_0 \in (F - F_p)^{(1)}$ and for every $c \in (C_p - C_{m_1})^{(k)}$ we 
   add $sub'_0\ (c\ x_1 \ldots x_k) = c\ (sub'_0\ x_1) \ldots (sub'_0\ x_k)$ to 
   $m$. 

3. For every $sub'_i \in F_m^{(i+1)}$ and $SUB_j \in C_{m_1}^{(j+1)}$, the 
   equation 

   \[
   sub'_i\ (SUB_j\ x_1\ x_2 \ldots x_{j+1})\ y_1 \ldots y_i
   = sub'_j\ x_1\ (sub'_i\ x_2\ y_1 \ldots y_i) \ldots (sub'_i\ x_{j+1}\ y_1 \ldots y_i)
   \]

   is added to $m$. 






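Let $e' = (sub'_0\ z)$, i.e. $e' \in E_{p'}$. Then 
$(p, e) \equiv_{C_p} (p', e')$, since for every $t \in T_{C_p}$ and 
$sub_i \in F_{m_2}^{(i+1)}$: 

\[
\begin{aligned}
& nf(\Rightarrow_p, (int\ t)) \\
&= nf(\Rightarrow_p, (sub_i\ (int\ t)\ \pi_1 \ldots \pi_i)) && \text{("$\pi$s are substituted by $\pi$s" (Struct. Ind.))} \\
&= nf(\Rightarrow_{p'}, (sub'_i\ t\ \pi_1 \ldots \pi_i)) && (*) \\
&= nf(\Rightarrow_{p'}, (sub'_0\ t)) && \text{("$\pi$s are substituted by $\pi$s" (Struct. Ind.))}
\end{aligned}
\]

The statement 

(*) For every $sub_i \in F_{m_2}^{(i+1)}$ and $t, t_1, \ldots, t_i \in T_{C_p}$: 

\[ nf(\Rightarrow_p, (sub_i\ (int\ t)\ t_1 \ldots t_i)) = nf(\Rightarrow_{p'}, (sub'_i\ t\ t_1 \ldots t_i)) \]

is proved by structural induction on $t \in T_{C_p}$. We only show the 
interesting case $t = (SUB_j\ t'_0\ t'_1 \ldots t'_j)$ with 
$SUB_j \in C_{m_1}^{(j+1)}$ and $t'_0, \ldots, t'_j \in T_{C_p}$: 

\[
\begin{aligned}
& nf(\Rightarrow_p, (sub_i\ (int\ (SUB_j\ t'_0\ t'_1 \ldots t'_j))\ t_1 \ldots t_i)) \\
&= nf(\Rightarrow_p, (sub_i\ (sub_j\ (int\ t'_0)\ (int\ t'_1) \ldots (int\ t'_j))\ t_1 \ldots t_i)) && \text{(Def. $int$)} \\
&= nf(\Rightarrow_p, (sub_j\ (int\ t'_0)\ (sub_i\ (int\ t'_1)\ t_1 \ldots t_i) \ldots (sub_i\ (int\ t'_j)\ t_1 \ldots t_i))) && (**) \\
&= nf(\Rightarrow_p, (sub_j\ (int\ t'_0)\ nf(\Rightarrow_p, (sub_i\ (int\ t'_1)\ t_1 \ldots t_i)) \ldots nf(\Rightarrow_p, (sub_i\ (int\ t'_j)\ t_1 \ldots t_i)))) && \text{("Split $nf$")} \\
&= nf(\Rightarrow_{p'}, (sub'_j\ t'_0\ nf(\Rightarrow_{p'}, (sub'_i\ t'_1\ t_1 \ldots t_i)) \ldots nf(\Rightarrow_{p'}, (sub'_i\ t'_j\ t_1 \ldots t_i)))) && \text{(Ind. Hyp. (*))} \\
&= nf(\Rightarrow_{p'}, (sub'_j\ t'_0\ (sub'_i\ t'_1\ t_1 \ldots t_i) \ldots (sub'_i\ t'_j\ t_1 \ldots t_i))) && \text{("Collect $nf$")} \\
&= nf(\Rightarrow_{p'}, (sub'_i\ (SUB_j\ t'_0\ t'_1 \ldots t'_j)\ t_1 \ldots t_i)) && \text{(Def. $sub'_i$)}
\end{aligned}
\]

The associativity of substitutions 

(**) For every $sub_i \in F_{m_2}^{(i+1)}$, $sub_j \in F_{m_2}^{(j+1)}$, 
$s_0, s_1, \ldots, s_j \in T_{C_p - C_{m_1}}$, and 
$t_1, \ldots, t_i \in T_{C_p}$: 

\[
nf(\Rightarrow_p, (sub_i\ (sub_j\ s_0\ s_1 \ldots s_j)\ t_1 \ldots t_i))
= nf(\Rightarrow_p, (sub_j\ s_0\ (sub_i\ s_1\ t_1 \ldots t_i) \ldots (sub_i\ s_j\ t_1 \ldots t_i)))
\]

is also proved by structural induction on $s_0 \in T_{C_p - C_{m_1}}$. We only 
show the "central" case $s_0 = \pi_k$ with $k \in [j]$: 

\[
\begin{aligned}
& nf(\Rightarrow_p, (sub_i\ (sub_j\ \pi_k\ s_1 \ldots s_j)\ t_1 \ldots t_i)) \\
&= nf(\Rightarrow_p, (sub_i\ s_k\ t_1 \ldots t_i)) \\
&= nf(\Rightarrow_p, (sub_j\ \pi_k\ (sub_i\ s_1\ t_1 \ldots t_i) \ldots (sub_i\ s_j\ t_1 \ldots t_i)))
\end{aligned}
\]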
Example 14 Integration translates $p_{fre}$ with initial expression (int z) into 
the following program $p_{int}$ with initial expression (app'_0 z): 

rev (A x1) = APP (rev x1) (A N)      app'_0 (APP x1 x2) = app' x1 (app'_0 x2) 
rev (B x1) = APP (rev x1) (B N)      app'_0 (A x1)      = A (app'_0 x1) 
rev N      = N                       app'_0 (B x1)      = B (app'_0 x1) 
                                     app'_0 N           = N 

app' (APP x1 x2) y1 = app' x1 (app' x2 y1) 
app' (A x1) y1      = A (app' x1 y1) 
app' (B x1) y1      = B (app' x1 y1) 
app' N y1           = y1 

Let $p, p' \in P$, $(f_1\ z), (f_2\ z), (f_2\ (f_1\ z)) \in E_p$, 
$(f'_1\ z), (f'_2\ z), (f'_2\ (f'_1\ z)) \in E_{p'}$, and 
$C' = C_p \cap C_{p'}$. If $(p, (f_1\ z)) \equiv_{C'} (p', (f'_1\ z))$ and 
$(p, (f_2\ z)) \equiv_{C'} (p', (f'_2\ z))$, then 
$(p, (f_2\ (f_1\ z))) \equiv_{C'} (p', (f'_2\ (f'_1\ z)))$. Thus in particular 
we get from Lemma 13: 

Corollary 15 $T\ ;\ 2\text{-}ModT(INT, SUB) \subseteq T\ ;\ MT$. □ 

Example 16 Integration translates $p_{fre}$ with initial expression 
(int (rev z)) into $p_{int}$ with initial expression (app'_0 (rev z)). □ 
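
The integrated functions of Examples 14 and 16 can be compared with the interpretation they replace. The sketch assumes the tuple encoding used for the earlier illustrations; `app_p` and `app_p0` are hypothetical names for app' and app'_0, and `interp` together with `app` is kept only as the reference behaviour of $p_{fre}$.

```python
N = ("N",)


def app(t, y):
    """app of p_fre, needed only for the reference function interp."""
    if t == N:
        return y
    c, x1 = t
    return (c, app(x1, y))


def interp(t):
    """int of p_fre: interprets APP as a call of app."""
    if t == N:
        return N
    if t[0] == "APP":
        return app(interp(t[1]), interp(t[2]))
    c, x1 = t
    return (c, interp(x1))


def app_p(t, y):
    """app' of p_int: walks through frozen APP nodes while substituting y."""
    if t == N:
        return y
    if t[0] == "APP":
        return app_p(t[1], app_p(t[2], y))
    c, x1 = t
    return (c, app_p(x1, y))


def app_p0(t):
    """app'_0 of p_int: the unary variant that takes over the role of int."""
    if t == N:
        return N
    if t[0] == "APP":
        return app_p(t[1], app_p0(t[2]))
    c, x1 = t
    return (c, app_p0(x1))


def rev_fre(t):
    """rev of p_fre and p_int: produces frozen APP nodes."""
    if t == N:
        return N
    c, x1 = t
    return ("APP", rev_fre(x1), (c, N))
```

On the frozen trees produced by `rev_fre`, `app_p0` returns the same result as `interp`, but it never builds the intermediate result of the inner interpretation before appending to it.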

4.3 Composition 

It is known that the composition of two tdtts can be simulated by a single 
tdtt. This result has been generalized; in particular, a composition technique 
has been presented which constructs an mtt $m$ for the composition of a tdtt 
$m_1$ with an mtt $m_2$. The central idea is the observation that, roughly 
speaking, intermediate results are built up from right-hand sides of $m_1$. 
Thus, instead of translating intermediate results by $m_2$, right-hand sides of 
$m_1$ are translated by $m_2$ to get the equations of $m$. For this purpose, 
$m$ uses $F_{m_1} \times F_{m_2}$ as function set. In the following, we 
abbreviate every pair $(f, g) \in F_{m_1} \times F_{m_2}$ by $fg$. 

In our construction it will be necessary to extend the call-by-name reduction 
relation to expressions containing variables (they are handled like 0-ary 
constructors) and to restrict the call-by-name reduction relation to use only 
equations of a certain mtt $m$, which will be denoted by $\Rightarrow_m$. 

Lemma 17 $T\ ;\ MT \subseteq MT$. 

Proof. Let $p \in P$ and $e \in E_p$, such that $m_1, m_2 \in M_p$, $m_1$ is a 
tdtt, $m_2$ is an mtt, and $e = (f_2\ (f_1\ z))$ for some $f_1 \in F_{m_1}$ and 
$f_2 \in F_{m_2}^{(1)}$. We construct $p' \in P$ by replacing $m_1$ and $m_2$ in 
$p$ by the following mtt $m$: 

1. From $m_2$ we construct an mtt $\hat{m}_2$ which is able to translate 
   right-hand sides of equations of $m_1$. Note that $\hat{m}_2$ is not part of 
   $p'$. 

   - $\hat{m}_2$ contains the equations of $m_2$. 

   - For every $g \in F_{m_2}^{(n+1)}$ and $f \in F_{m_1}$ we add the following 
     equation to $\hat{m}_2$: 

     \[ g\ (f\ x_1)\ y_1 \ldots y_n = fg\ x_1\ y_1 \ldots y_n \]

     where every $f \in F_{m_1}$ and $fg$ with $f \in F_{m_1}$ and 
     $g \in F_{m_2}^{(n+1)}$ is viewed as an additional unary and (n+1)-ary 
     constructor, respectively.³ 

2. Let $F_m^{(n+1)} = \{fg \mid f \in F_{m_1}, g \in F_{m_2}^{(n+1)}\}$. For 
   every $g \in F_{m_2}^{(n+1)}$, $f \in F_{m_1}$, and $c \in C_p^{(k)}$, such 
   that $f\ (c\ x_1 \ldots x_k) = rhs(f, c)$ is an equation in $m_1$, $m$ 
   contains the equation 

   \[ fg\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_n = nf(\Rightarrow_{\hat{m}_2}, g\ (rhs(f, c))\ y_1 \ldots y_n). \]

³ Strictly speaking, $fg\ x_1\ y_1 \ldots y_n$ is not a legal right-hand side, 
since $x_1$ does not occur as first argument of a function symbol, but of a 
constructor. Again, an additional identity function would solve the problem 
formally. 

Let $e' = (f_1 f_2\ z)$, i.e. $e' \in E_{p'}$. Then, 
$(p, e) \equiv_{C_p} (p', e')$. We omit the proof and only mention that in the 
literature another construction was used, which first splits the mtt into a tdtt 
and a so-called yield function, which handles parameter substitutions (cf. also 
Subsection 5.1), then composes the two tdtts, and finally joins the resulting 
tdtt to the yield function. We get the same transformation result in one step by 
avoiding the explicit splitting and joining. □ 



Example 18 Composition translates $p_{int}$ with initial expression 
(app'_0 (rev z)) into a program and an initial expression which can be obtained 
from $p_{acc}$ and (rev z), respectively, by a renaming of functions: 

Let $M_{p_{int}}$ contain the tdtt $m_{int,rev}$ and the mtt $m_{int,app}$ 
containing the definitions of rev and of app'_0 and app', respectively. The mtt 
$\hat{m}_{int,app}$ is given by: 

app'_0 (APP x1 x2) = app' x1 (app'_0 x2)      app' (APP x1 x2) y1 = app' x1 (app' x2 y1) 
app'_0 (A x1)      = A (app'_0 x1)            app' (A x1) y1      = A (app' x1 y1) 
app'_0 (B x1)      = B (app'_0 x1)            app' (B x1) y1      = B (app' x1 y1) 
app'_0 N           = N                        app' N y1           = y1 
app'_0 (rev x1)    = revapp'_0 x1             app' (rev x1) y1    = revapp' x1 y1 

The new program contains the following equations; in each derivation, the first 
line is the left-hand side and the last line is the right-hand side of the 
resulting equation: 

\[
\begin{aligned}
& revapp'\ (A\ x_1)\ y_1 \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'\ (rhs(rev, A))\ y_1) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'\ (APP\ (rev\ x_1)\ (A\ N))\ y_1) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'\ (rev\ x_1)\ (app'\ (A\ N)\ y_1)) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ revapp'\ x_1\ (app'\ (A\ N)\ y_1)) \\
&= revapp'\ x_1\ (A\ y_1)
\end{aligned}
\]

\[
\begin{aligned}
& revapp'_0\ (A\ x_1) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'_0\ (rhs(rev, A))) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'_0\ (APP\ (rev\ x_1)\ (A\ N))) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ app'\ (rev\ x_1)\ (app'_0\ (A\ N))) \\
&= nf(\Rightarrow_{\hat{m}_{int,app}},\ revapp'\ x_1\ (app'_0\ (A\ N))) \\
&= revapp'\ x_1\ (A\ N)
\end{aligned}
\]

revapp' (B x1) y1 = ... = revapp' x1 (B y1) 
revapp' N y1      = ... = y1 
revapp'_0 (B x1)  = ... = revapp' x1 (B N) 
revapp'_0 N       = ... = N 

The new initial expression is (revapp'_0 z). Note that, by a known result, we 
could also have used deforestation to get the same transformation result. □ 
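
The derivations of Example 18 can be replayed mechanically by running the extended functions on the right-hand sides of rev. The following sketch makes several assumptions for illustration: right-hand sides are encoded as tuples, the variables x1 and y1 are plain strings, the frozen call (rev x1) is the tuple `("rev", "x1")`, and Python's eager evaluation is used in place of call-by-name reduction (for these terminating and confluent rules the normal forms coincide). The names `app_hat`, `app0_hat`, `RHS_REV`, and `composed_equations` are hypothetical.

```python
N = ("N",)

# Right-hand sides of rev in p_int, indexed by the root symbol of the argument.
RHS_REV = {
    "A": ("APP", ("rev", "x1"), ("A", N)),
    "B": ("APP", ("rev", "x1"), ("B", N)),
    "N": N,
}


def app_hat(t, y):
    """app' of the extended mtt: also handles the frozen call (rev x1)."""
    tag = t[0]
    if tag == "N":
        return y
    if tag == "APP":
        return app_hat(t[1], app_hat(t[2], y))
    if tag == "rev":
        # app' (rev x1) y1 = revapp' x1 y1
        return ("revapp'", t[1], y)
    return (tag, app_hat(t[1], y))


def app0_hat(t):
    """app'_0 of the extended mtt."""
    tag = t[0]
    if tag == "N":
        return N
    if tag == "APP":
        return app_hat(t[1], app0_hat(t[2]))
    if tag == "rev":
        # app'_0 (rev x1) = revapp'_0 x1
        return ("revapp'_0", t[1])
    return (tag, app0_hat(t[1]))


def composed_equations():
    """Right-hand sides of revapp' and revapp'_0 for every root symbol."""
    equations = {}
    for c, rhs in RHS_REV.items():
        equations[("revapp'", c)] = app_hat(rhs, "y1")
        equations[("revapp'_0", c)] = app0_hat(rhs)
    return equations
```

The entry for `("revapp'", "A")` is the encoded term revapp' x1 (A y1), and the entry for `("revapp'_0", "A")` is revapp' x1 (A N), which agrees with the two derivations shown in the example.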



5 Deaccumulation 

This section shows that results from the theory of tree transducers can also be 
applied to translate accumulative programs of the kind "efficient reverse" into 
non-accumulative programs of the kind "inefficient reverse", thereby proving 
that the classes $2\text{-}ModT(T, SUB)$ and $MT$ are equal. So far we consider 
this transformation direction as purely theoretical, but there may be cases in 
which the non-accumulative programs are more efficient than their related 
accumulative versions (see Section 6). 

Theorem 19 $MT \subseteq 2\text{-}ModT(T, SUB)$. 

Proof. Follows from Lemma 21 and Lemma 23. □ 

From Theorems 10 and 19 we get a new characterization of $MT$: 

Corollary 20 $MT = 2\text{-}ModT(T, SUB)$. □ 

5.1 Decomposition 

We have already mentioned that an mtt can be simulated by the composition of a 
tdtt with a yield function. In this paper we use a construction for this result 
which is based on the proof of Lemma 5.5 in [7]. The key idea is to simulate the 
task of an (n+1)-ary function $g$ by a unary function $g'$.⁴ Since $g'$ does not 
know the current values of its context arguments, it uses a new constructor 
$\pi_j$ wherever $g$ uses its j-th context argument. For this purpose, every 
variable $y_j$ in the right-hand sides of equations for $g$ is replaced by 
$\pi_j$. The current context arguments themselves are integrated into the 
calculation by replacing every occurrence of the form $(g\ x_i \ldots)$ in a 
right-hand side by $(SUB_n\ (g'\ x_i) \ldots)$, where $SUB_n$ is a new 
constructor. Roughly speaking, an int-module interprets every occurrence of 
$SUB_n$ as a substitution which replaces every occurrence of $\pi_j$ in the 
first subtree of $SUB_n$ by the j-th context argument. 

Lemma 21 $MT \subseteq T\ ;\ 2\text{-}ModT(INT, SUB)$. 

Proof. Let $p \in P$ and $e \in E_p$, such that $m \in M_p$ is an mtt and 
$e = (f\ z)$ for some $f \in F_m^{(1)}$. We construct $p' \in P$ by replacing 
$m$ in $p$ by the following tdtt $m_1$, int-module $m_2$, and sub-module $m_3$, 
where $(m_2, m_3)$ is a 2-modtt. Let 
$A = \{n \in \mathbb{N} \mid F_m^{(n+1)} \neq \emptyset\}$ and $mx$ be the 
maximum of $A$. 

1. For every $a \in A - \{0\}$ let $SUB_a \in (C - C_p)^{(a+1)}$ and for every 
   $j \in [mx]$ let $\pi_j \in (C - C_p)^{(0)}$ be distinct new constructors. 

   For every $g \in F_m^{(n+1)}$, $c \in C_p^{(k)}$, and equation 
   $g\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_n = rhs(g, c)$ in $m$, an equation 
   $g\ (c\ x_1 \ldots x_k) = tr(rhs(g, c))$ with $g \in F_{m_1}^{(1)}$ is 
   contained in $m_1$, where 
   $tr : RHS(F_m, C_p, X_k, Y_n) \to RHS(F_{m_1}, C_p \cup \{SUB_a \mid a \in A - \{0\}\} \cup \{\pi_1, \ldots, \pi_n\}, X_k, Y_0)$ 
   is defined by: 

   For every $a \in \mathbb{N}_+$, $h \in F_m^{(a+1)}$, $i \in [k]$, and 
   $r_1, \ldots, r_a \in RHS(F_m, C_p, X_k, Y_n)$: 
   \[ tr(h\ x_i\ r_1 \ldots r_a) = SUB_a\ (h\ x_i)\ tr(r_1) \ldots tr(r_a). \]

   For every $h \in F_m^{(1)}$ and $i \in [k]$: 
   \[ tr(h\ x_i) = (h\ x_i). \]

   For every $a \in \mathbb{N}$, $c' \in C_p^{(a)}$, and 
   $r_1, \ldots, r_a \in RHS(F_m, C_p, X_k, Y_n)$: 
   \[ tr(c'\ r_1 \ldots r_a) = c'\ tr(r_1) \ldots tr(r_a). \]

   For every $j \in [n]$: 
   \[ tr(y_j) = \pi_j. \]

⁴ In the formal construction $g$ is not renamed. 

2. $m_2$ contains for a new function $int \in (F - F_p)^{(1)}$ and for every 
   $c \in (C_p \cup \{\pi_1, \ldots, \pi_{mx}\})^{(k)}$ the equation 

   \[ int\ (c\ x_1 \ldots x_k) = c\ (int\ x_1) \ldots (int\ x_k) \]

   and for every $a \in A - \{0\}$ the equation 

   \[ int\ (SUB_a\ x_1 \ldots x_{a+1}) = sub_a\ (int\ x_1) \ldots (int\ x_{a+1}). \]

3. $m_3$ contains for every $a \in A - \{0\}$,⁵ for every new function 
   $sub_a \in (F - F_p)^{(a+1)}$, and for every $c \in C_p^{(k)}$ the equation 

   \[ sub_a\ (c\ x_1 \ldots x_k)\ y_1 \ldots y_a = c\ (sub_a\ x_1\ y_1 \ldots y_a) \ldots (sub_a\ x_k\ y_1 \ldots y_a) \]

   and for every $j \in [a]$ the equation 

   \[ sub_a\ \pi_j\ y_1 \ldots y_a = y_j. \]

Let $e' = (int\ (f\ z))$, i.e. $e' \in E_{p'}$. Then, 
$(p, e) \equiv_{C_p} (p', e')$. □ 



Example 22 Decomposition translates p_acc with initial expression (rev z) into
the following program with initial expression (int (rev z)):

rev (A x1) = SUB1 (rev' x1) (A N)        rev' (A x1) = SUB1 (rev' x1) (A π1)
rev (B x1) = SUB1 (rev' x1) (B N)        rev' (B x1) = SUB1 (rev' x1) (B π1)
rev N      = N                           rev' N      = π1

int (SUB1 x1 x2) = sub1 (int x1) (int x2)
int (A x1)       = A (int x1)
int (B x1)       = B (int x1)
int N            = N
int π1           = π1

sub1 (A x1) y1 = A (sub1 x1 y1)
sub1 (B x1) y1 = B (sub1 x1 y1)
sub1 N y1      = N
sub1 π1 y1     = y1

□
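The decomposed program can be traced with a small interpreter. The following Python sketch is only an illustration under stated assumptions: terms are encoded as nested tuples, and the names `rev_frozen`, `rev_prime`, `interp`, `sub1`, `from_str` and `to_str` are hypothetical stand-ins chosen here (for rev, rev', int and sub1 and for two conversion helpers), not names from the construction.

```python
# Terms are nested tuples: ("A", t), ("B", t), ("N",), ("pi1",), ("SUB1", t1, t2).
N = ("N",)
PI1 = ("pi1",)

def rev_prime(t):
    # rev' (c x1) = SUB1 (rev' x1) (c pi1) for c in {A, B};  rev' N = pi1
    if t == N:
        return PI1
    c, x1 = t
    return ("SUB1", rev_prime(x1), (c, PI1))

def rev_frozen(t):
    # rev (c x1) = SUB1 (rev' x1) (c N) for c in {A, B};  rev N = N
    if t == N:
        return N
    c, x1 = t
    return ("SUB1", rev_prime(x1), (c, N))

def sub1(t, y1):
    # sub1 replaces the occurrence of pi1 in t by the context argument y1
    if t == PI1:
        return y1
    if t == N:
        return N
    c, x1 = t
    return (c, sub1(x1, y1))

def interp(t):
    # int turns the frozen constructor SUB1 into a call of sub1
    if t[0] == "SUB1":
        return sub1(interp(t[1]), interp(t[2]))
    if t in (N, PI1):
        return t
    c, x1 = t
    return (c, interp(x1))

def from_str(s):
    # "AB" -> A (B N)
    t = N
    for ch in reversed(s):
        t = (ch, t)
    return t

def to_str(t):
    # A (B N) -> "AB"; only meant for terms built from A, B and N
    out = []
    while t != N:
        out.append(t[0])
        t = t[1]
    return "".join(out)

print(to_str(interp(rev_frozen(from_str("ABB")))))
```

Evaluating `interp(rev_frozen(...))` mirrors the initial expression (int (rev z)): the tdtt first builds a term containing SUB1 and π1, and the interpretation then performs the substitutions.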



5.2 Thawing 

We use an inverse construction as in the proof of Lemma HD to thaw “frozen 
functions” of a tdtt, i.e. we substitute occurrences of those constructors SUBi, 
which the interpretation replaces by functions subi, by occurrences of subi. 

Lemma 23 T; 2-ModT(INT, SUB) ⊆ 2-ModT(T, SUB).

Proof. Let p ∈ P and e ∈ E_p, such that m_1, m_2, m_3 ∈ M_p, m_1 is a tdtt, (m_2, m_3)
is a 2-modtt, m_2 is an int-module, m_3 is a sub-module, and e = (int (f z)) with
F_{m_2} = {int} and for some f ∈ F_{m_1}. We construct p' ∈ P from p by dropping
m_2 and by changing m_1:

For every g ∈ F_{m_1} and c ∈ …, the equation g (c x_1 ... x_k) = rhs(g,c)
of m_1 is replaced by g (c x_1 ... x_k) = thaw(rhs(g,c)), where thaw(rhs(g,c))
is constructed from rhs(g,c) by replacing every occurrence of SUB_i ∈ …
(cf. Def. 0 for C_{m_2}) by sub_i ∈ F_{m_3}^(i+1), iff the equation int (SUB_i x_1 ... x_{i+1}) =
sub_i (int x_1) ... (int x_{i+1}) is in m_2. Thus, m_1 becomes a tdtt-module and
(m_1, m_3) a 2-modtt.

Footnote: If A = {0}, i.e. m was already a tdtt, then we construct a “dummy function” sub_0.

Let e' = (f z), i.e. e' ∈ E_{p'}. Then, (p, e) ≡_{C_p − C_{m_2}} (p', e'), since

(*) For every g ∈ F_{m_1} and t ∈ T_{C_p − C_{m_2}}: nf(⇒_p, (int (g t))) = nf(⇒_{p'}, (g t))

is proved by structural induction on t ∈ T_{C_p − C_{m_2}}. This proof requires another
structural induction on r ∈ RHS(F_{m_1}, C_p, X_k, Y_0) to prove

(**) For every k ∈ ℕ, r ∈ RHS(F_{m_1}, C_p, X_k, Y_0), and t_1, ..., t_k ∈ T_{C_p − C_{m_2}}:

     nf(⇒_p, (int r[x_1/t_1, ..., x_k/t_k])) = nf(⇒_{p'}, thaw(r)[x_1/t_1, ..., x_k/t_k]),

where [x_1/t_1, ..., x_k/t_k] denotes the substitution of every occurrence of x_i in r
and thaw(r), respectively, by t_i. □

Example 24 Thawing translates p_dec with initial expression (int (rev z)) into
the following program p'_non with initial expression (rev z):

rev (A x1) = sub1 (rev' x1) (A N)        rev' (A x1) = sub1 (rev' x1) (A π1)
rev (B x1) = sub1 (rev' x1) (B N)        rev' (B x1) = sub1 (rev' x1) (B π1)
rev N      = N                           rev' N      = π1

sub1 (A x1) y1 = A (sub1 x1 y1)
sub1 (B x1) y1 = B (sub1 x1 y1)
sub1 N y1      = N
sub1 π1 y1     = y1

It is surprising that p'_non is very similar to p_non: sub1 in p'_non corresponds to app
in p_non, but substitutes the symbol π1 instead of N. rev' in p'_non corresponds to
rev in p_non, but uses sub1 and π1 instead of app and N. The additional rev in
p'_non ensures that π1 does not occur in the output. □
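The thawed program can be run directly. The following Python sketch is an illustration only; the tuple encoding of terms and the helper names `from_str` and `to_str` are assumptions made here, while `rev`, `rev_prime` and `sub1` transcribe the equations of p'_non.

```python
# Terms are nested tuples: ("A", t), ("B", t), ("N",), ("pi1",).
N = ("N",)
PI1 = ("pi1",)

def sub1(t, y1):
    # sub1 (c x1) y1 = c (sub1 x1 y1);  sub1 N y1 = N;  sub1 pi1 y1 = y1
    if t == PI1:
        return y1
    if t == N:
        return N
    c, x1 = t
    return (c, sub1(x1, y1))

def rev_prime(t):
    # rev' (c x1) = sub1 (rev' x1) (c pi1);  rev' N = pi1
    if t == N:
        return PI1
    c, x1 = t
    return sub1(rev_prime(x1), (c, PI1))

def rev(t):
    # rev (c x1) = sub1 (rev' x1) (c N);  rev N = N
    if t == N:
        return N
    c, x1 = t
    return sub1(rev_prime(x1), (c, N))

def from_str(s):
    # "AB" -> A (B N)
    t = N
    for ch in reversed(s):
        t = (ch, t)
    return t

def to_str(t):
    # A (B N) -> "AB"; only meant for terms built from A, B and N
    out = []
    while t != N:
        out.append(t[0])
        t = t[1]
    return "".join(out)

print(to_str(rev(from_str("ABBA" + "B"))))
```

In this sketch every result of `rev_prime` ends in π1 rather than N, which is why `sub1` can append to it; the outer `rev` closes the list with N, so π1 never reaches the output.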

6 Future Work 

In this paper we have always considered non-accumulative functional programs
as less efficient than their related accumulative versions, just as in HH the
pure elimination of intermediate results was considered a success. This assumption
holds for the example programs studied so far, but may be wrong in general. It
is necessary to find sufficient conditions for our source programs, under
which we can guarantee that the accumulation technique (and maybe also the
deaccumulation technique) does not deteriorate the efficiency. In
particular, is the linearity of programs a sufficient condition for accumulation
(as for deforestation and tree transducer composition [liSIlbj)?

We have presented sufficient conditions, such that we can compose the modules
of a 2-modtt. Is it possible to extend the applicability of the accumulation
technique by relaxing these conditions or by using other conditions? Additionally,
it would be interesting to analyze n-modtts with n > 2 in this context.
Abstract. Context unification was originally defined by H. Comon in
ICALP’92, as the problem of finding a unifier for a set of equations containing
first-order variables and context variables. These context variables
have arguments, and can be instantiated by contexts. In other words, they
are second-order variables that are restricted to be instantiated by linear
terms (a linear term is a λ-expression λx_1 ... λx_n . t where every x_i
occurs exactly once in t).

In this paper, we prove that, if the so called rank-bound conjecture is true,
then the context unification problem is decidable. This is done by reducing
context unification to solvability of traversal equations (a kind of word
unification modulo certain permutations) and then reducing traversal
equations to word equations with regular constraints.



1 Introduction 

Context unification is defined as the problem of finding a unifier for a finite set
of equations where, in addition to first-order variables, we also consider context
variables. These variables are applied to terms, and can be instantiated
by contexts, i.e. by linear second-order terms. A linear second-order term is a
λ-expression λx_1 ... λx_n . t where x_1, ..., x_n are first-order bound variables and
occur exactly once in t. Therefore, context unification can be considered as a
variant of second-order unification where possible instances of second-order variables
are restricted to be linear. Sometimes, context variables are required to be
unary. However, this restriction does not help to prove the decidability of the
problem, and it will not be used in this paper. Given an instance of the problem,
if it has a solution considered as a context unification problem, then it also has
a solution as a second-order unification problem. Obviously, the converse is not
true.

The context unification problem was originally formulated by H. Comon 
in IComh'ii ]omm . There it is proved that context unification is decidable when, 
for any context variable, all its occurrences have the same argument. Later, it 
was proved [SSMESHBISM that the problem is also decidable when context 
variables are stratified, i.e. when, for any variable, the list of context variables 

* This work has been partially supported by the GICYT research projects DENOC 
(BFM 2000-1054-C02-01) and MODELOGOS (TIC 97-0579-C02-01). 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 1 2001. 
we find going from the root of the term to any occurrence of this variable is
always the same. It was also proved that a generalization of the problem
(the linear second-order unification problem, where third-order constants are
also allowed) is decidable when no variable occurs more than twice. Recently, it
has been proved [SSS90] that context unification is also decidable for problems
containing no more than two context variables. The relationship between the
context unification problem and the linear second-order unification problem is
studied in [LV00b].

Decidability of context unification would have important consequences in 
different research areas. For instance, some partial decidability results are used 
in k ;omf>'il to prove decidability of membership constraints, in to prove de- 
cidability of distributive unification, in to define a completion procedure 

for bi-rewriting systems. In [FlNUXilH) it is proved that parallelism constraints -a 
kind of partial description of trees- are equivalent to context unification. Domi- 
nance constraints are a subset of parallelism constraints, and their solvability is 
decidable |KN'l’f)S) . Other application areas of context unification include
computational linguistics [NPR97b]. The common assumption is that context
unification is decidable. This is because the various restrictions that make context
unification decidable do not make second-order unification decidable when they
are applied to it [Lev98, LV00a].

In [Lev96] there is a description of a sound and complete context unification
procedure, based on Pietrzykowski’s procedure for second-order unification.
Like Pietrzykowski’s procedure, this procedure does not always terminate.
The linearity restriction makes some trivially solvable second-order unification
problems, like X(a) = X(b), unsolvable when we only consider context unifiers.
Notice that this problem has only one unifier [X ↦ λx . Y], which is not linear
because x does not occur exactly once in Y. In particular, flexible-flexible pairs, which
are always solvable in second-order unification, are now not necessarily solvable.

The bounded second-order unification problem is another variant of second-order
unification, similar to context unification. There, instances of second-order
variables are required to use their arguments a bounded number of times. We can
easily reduce any k-bounded second-order unification problem, like X(Y(a,b)) =
Y(X(a),b), to a context unification problem, like

   X(Y(a, .p., a, b, .q., b), .r., Y(a, .p., a, b, .q., b)) =
      Y(X(a, .r., a), .p., X(a, .r., a), b, .q., b)

nondeterministically, for any possible choice of p, q, r ≤ k satisfying the bound
(here s, .p., s abbreviates p copies of s).
The converse reduction does not seem easy to find. The bounded second-order
unification problem has recently been proved decidable [SS99a].

The relationship between context unification and word unification [Mak77]
was originally suggested in [Lev96]. In [SSS98] it is proved that the exponent
of periodicity lemma also holds for context unification. We can easily
reduce word unification to context unification by encoding any word unification
problem, like F a G = G a F, as a monadic context unification problem
F(a(G(b))) = G(a(F(b))), where b is a new constant. This paper suggests that
the opposite reduction may also be possible. In the following section we motivate
this statement using a naive reduction. Although it does not work, we will see
in the rest of the paper how it could be adapted properly.
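The encoding of a word equation as a monadic context unification problem can be sketched in a few lines of Python. This is an illustration only: the function names `word_to_monadic` and `encode_equation` and the string output format are assumptions made here.

```python
def word_to_monadic(word, end="b"):
    """Encode a word (a list of symbols) as a monadic term ending in the
    fresh constant `end`, e.g. ["F", "a", "G"] -> "F(a(G(b)))"."""
    term = end
    for sym in reversed(word):
        term = f"{sym}({term})"
    return term

def encode_equation(lhs, rhs, end="b"):
    """Encode a word equation lhs = rhs as a monadic context equation."""
    return f"{word_to_monadic(lhs, end)} = {word_to_monadic(rhs, end)}"

print(encode_equation(["F", "a", "G"], ["G", "a", "F"]))
```

Every word variable becomes a unary context variable and every letter a unary function symbol, so concatenation of words turns into nesting of terms.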



2 A Naive Reduction 



Given a signature where every symbol has a fixed arity, we can encode a term
using its pre-order traversal sequence. We can use this fact to encode a context
unification problem, like the following one

   X(Y(a,b)) ≟ Y(X(a),b)                                  (1)

as the following word unification problem

   X0 Y0 a Y1 b Y2 X1 ≟ Y0 X0 a X1 Y1 b Y2                (2)

We can prove easily that if the context unification problem (1) is solvable, then
its corresponding word unification problem (2) is also solvable. In our example,
the solution corresponding to the following unifier

   X ↦ λx . f(f(x, b), b)
   Y ↦ λx.λy . f(f(f(x, b), y), b)

is

   X0 ↦ f f        Y0 ↦ f f f
   X1 ↦ b b        Y1 ↦ b
                   Y2 ↦ b

Unfortunately, the converse is not true. We can find a solution of the word
unification problem which does not correspond to the pre-order traversal of any
instantiation of the original context unification problem (consider the unifier
that instantiates X0, X1, Y0, Y1 and Y2 by the empty word). Word unification
is decidable [Mak77], and given a solution of the word unification problem we
can check if it corresponds to a solution of the context unification problem.
Unfortunately, word unification is also infinitary, and we cannot repeat this test
for infinitely many word unifiers.
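The naive reduction can be checked on this example with a short Python sketch. It is an illustration under assumptions: terms are encoded as pairs (symbol, list of subterms), and the helper names `preorder`, `instantiate`, `X` and `Y` are chosen here for the sketch.

```python
def preorder(term):
    """Pre-order traversal of a term given as (symbol, [subterms])."""
    sym, args = term
    out = [sym]
    for arg in args:
        out.extend(preorder(arg))
    return out

def f(left, right):
    return ("f", [left, right])

a = ("a", [])
b = ("b", [])

# The context unifier of the example, written as Python functions.
def X(x):
    return f(f(x, b), b)

def Y(x, y):
    return f(f(f(x, b), y), b)

lhs_term = X(Y(a, b))
rhs_term = Y(X(a), b)

# The word unification problem and the corresponding word unifier.
lhs_word = ["X0", "Y0", "a", "Y1", "b", "Y2", "X1"]
rhs_word = ["Y0", "X0", "a", "X1", "Y1", "b", "Y2"]
word_sol = {"X0": "ff", "Y0": "fff", "X1": "bb", "Y1": "b", "Y2": "b"}

def instantiate(word, sol):
    """Replace every word variable by its value; letters stay as they are."""
    return "".join(sol.get(sym, sym) for sym in word)

print("".join(preorder(lhs_term)))
print(instantiate(lhs_word, word_sol), instantiate(rhs_word, word_sol))

# The all-empty word unifier also solves the word equation, but the common
# instance "ab" is not the pre-order traversal of any term.
empty_sol = {var: "" for var in word_sol}
print(instantiate(lhs_word, empty_sol), instantiate(rhs_word, empty_sol))
```

Both sides of the word equation instantiate to the pre-order traversal of the unified term, while the empty unifier shows why solvability of the word equation alone is not enough.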

The idea to overcome this difficulty comes from the notion of rank of a term.
In Figure 1 there are some examples of terms (trees) with different ranks. Notice
that terms with rank bounded by zero are isomorphic to words, and those with
rank bounded by one are caterpillars. For signatures of binary symbols, the rank
of a term can be defined as follows

   rank(a) = 0

   rank(f(t1,t2)) = 1 + rank(t1)               if rank(t1) = rank(t2)
   rank(f(t1,t2)) = max{rank(t1), rank(t2)}    if rank(t1) ≠ rank(t2)

Alternatively, the rank of a binary tree can also be defined as the depth of the
greatest complete binary tree that is embedded in the tree, using the standard
embedding of trees.
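The recursive definition of the rank for binary signatures translates directly into code. The following Python sketch is an illustration; the encoding of terms (a string for a constant, a triple for a binary symbol) and the example terms are assumptions made here.

```python
def rank(term):
    """Rank of a term over constants and binary symbols.

    A term is either a string (a constant) or a triple
    (symbol, left_subterm, right_subterm).
    """
    if isinstance(term, str):
        return 0
    _, left, right = term
    rank_left, rank_right = rank(left), rank(right)
    if rank_left == rank_right:
        return 1 + rank_left
    return max(rank_left, rank_right)

# A caterpillar: every inner node has at least one constant child.
caterpillar = ("f", "a", ("g", "b", ("h", "c", "d")))
# A complete binary tree of depth 2.
complete2 = ("f", ("f", "a", "a"), ("f", "a", "a"))

print(rank("a"), rank(caterpillar), rank(complete2))
```

The caterpillar keeps rank one however long it grows, whereas the rank of a complete binary tree equals its depth.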
Fig. 1. Examples of trees with ranks equal to 0, 1, 2 and ∞.

We conjecture that there is a computable function Φ such that, for every
solvable context unification problem t = u, there exists a ground unifier σ,
such that the rank of σ(t) is bounded by the size of the problem: rank(σ(t)) ≤
Φ(size(t = u)).

The other idea is to generalise pre-order traversal sequences to a more general
notion of traversal sequence, by allowing subterms to be traversed in different
orders. Then, any rank-bounded term has a traversal sequence belonging to a
regular language. We also introduce a new notion of traversal equation, noted
t = u, and meaning that t and u are traversal sequences of the same term. We prove
that a variant of these constraints can be reduced to word equations with regular
constraints, which are decidable ISEH].

The rest of this paper proceeds as follows. In Section 3 we introduce basic notation.
In Section 4 we define the notions of traversal sequence, rank of a traversal
sequence, rank of a term, and normal traversal sequence. Traversal equations
are introduced in Section 5. There, we prove that solvability of rank- and
permutation-bounded traversal equations is decidable, by reducing the problem
to solvability of word equations with regular constraints. In Section 6 we state
the rank-bound conjecture. Finally, in Section 7 we show how, if the conjecture
is true, context unification could be reduced to rank- and permutation-bounded
traversal systems.

3 Preliminary Definitions 

In this section, we introduce some definitions and notations. Most of them are
standard and can be skipped.

We define terms over a second-order signature (Σ, 𝒳) of constants Σ =
∪_{i≥0} Σ_i and variables 𝒳 = ∪_{j≥0} 𝒳_j, where any constant f ∈ Σ_i or variable
X ∈ 𝒳_j has a fixed arity: arity(f) = i, arity(X) = j. Constants from Σ_0 are
called first-order constants whereas constants from Σ\Σ_0 are called second-order
constants or function symbols. Similarly, variables from 𝒳_0 are first-order
variables, and those from 𝒳\𝒳_0 are context variables. First-order terms T^1(Σ, 𝒳)
and second-order terms T^2(Σ, 𝒳) are defined as usual. The set of free variables
of a term t is denoted by Var(t). The size of a first-order term is defined
inductively by size(f(t_1,..., t_n)) = 1 + Σ_{i∈[1..n]} size(t_i), f being either an n-ary
constant or variable. The arity of a (βη-normalised) second-order term is defined by
arity(λx_1 ... λx_n . t) = n. A second-order term λx_1 ... λx_n . t is said to be linear
if any bound variable x_i occurs exactly once in t. Since first-order terms do
not contain bound variables, any first-order term is linear.

A position within a term is defined, using Dewey decimal notation, as a
sequence of integers i_1 ... i_n, Λ being the empty sequence. The concatenation
of two sequences is denoted by p_1 · p_2. The concatenation of an integer and
a sequence is also denoted by i · p, where i, j, ... stand for integers and p, q, ... for
sequences. The subterm of t at position p is denoted by t|_p. By t[u]_p we denote
the term t where the subterm at position p has been replaced by u.

The group of permutations of n elements is denoted by Π_n. A permutation
ρ of n elements is denoted as a sequence of integers [ρ(1),..., ρ(n)].

A context unification problem is a finite sequence of equations {t_i ≟
u_i}_{i∈[1..n]}, an equation t ≟ u being a pair of first-order terms t, u ∈ T^1(Σ, 𝒳).
The size of a problem is defined by size({t_i ≟ u_i}_{i∈[1..n]}) = Σ_{i∈[1..n]} (size(t_i) +
size(u_i)). A position within a problem or an equation is defined by

   {t_i ≟ u_i}_{i∈[1..n]} |_{j·p} = (t_j ≟ u_j)|_p
   (t ≟ u)|_{1·p} = t|_p
   (t ≟ u)|_{2·p} = u|_p

A second-order substitution is a finite sequence of pairs of variables and terms
σ = [X_1 ↦ s_1,..., X_m ↦ s_m], where X_i and s_i are restricted to have the same
arity. A context substitution is a second-order substitution where the s_i's are
linear terms. A substitution σ = [X_1 ↦ s_1,..., X_n ↦ s_n] defines a mapping from
terms to terms. A substitution σ_1 is said to be more general than another σ_2, if
there exists another substitution ρ such that σ_2 = ρ ∘ σ_1.

Given a context unification problem {t_i ≟ u_i}_{i∈[1..n]}, a context [second-order]
substitution σ = [X_1 ↦ s_1,..., X_m ↦ s_m] is said to be a context [second-order]
unifier if σ(t_i) = σ(u_i), for any i ∈ [1..n]. A unifier σ is said to be most general,
m.g.u. for short, if no other unifier is strictly more general than it. It is said
to be ground if σ(t_i) does not contain variables, for any i ∈ [1..n]. A context
unification problem is said to be solvable if it has a context unifier.

The context unification problem is defined as the problem of deciding whether a
given context unification problem has a context unifier.

Without loss of generality, we can assume that the unification problem
contains just one equation t ≟ u. We will also assume that the signature Σ is
finite, and that it contains at least a first-order constant and a binary function
symbol. This ensures that any solvable context unification problem has a ground
unifier, and we can guess constant symbols in non-deterministic computations.
If nothing is said, the signature of a problem is the set of symbols occurring in
the problem, plus a first-order and a binary constant, if required.

In the appendix we include a variant of the sound and complete context
unification procedure described in [Lev96], adapted to our current setting.
This procedure can be used to find most general unifiers, and a variant of it to
find minimal ground unifiers.

4 Terms and Traversal Sequences 

The solution to the problems pointed out in the introduction comes from generalising
the definition of pre-order traversal sequence. It will allow us to traverse
the branches of a tree, i.e. the arguments of a function, in any possible order.
In order to reconstruct the term from the traversal sequence, we have to annotate
the permutation we have used in this particular traversal sequence. For
this purpose, we define a new signature Σ_Π containing n! symbols f^ρ for each
n-ary symbol f ∈ Σ, where ρ ∈ Π_n and Π_n is the group of permutations of n
elements.

Definition 1. Given a signature Σ = ∪_{i≥0} Σ_i, we define the extended signature

   Σ_Π = {f^ρ | f ∈ Σ ∧ ρ ∈ Π_arity(f)}

where Π_n is the group of permutations over n elements.

For any f^ρ ∈ Σ_Π, and its corresponding f ∈ Σ, we define arity(f^ρ) = arity(f).
A sequence s ∈ (Σ_Π)* is said to be a traversal sequence of a term t ∈ T(Σ) if:

1. t ∈ Σ_0, and s = t (the permutation is omitted for first-order constants); or

2. t = f(t_1,...,t_n), for any i ∈ [1..n] there exists a sequence s_i that is
   a traversal sequence of t_i, and there exists a permutation ρ ∈ Π_n such that

   s = f^ρ s_ρ(1) ... s_ρ(n).

Definition 2. Given a sequence of symbols a_1 ... a_n ∈ (Σ_Π)*, we define its
width as

   width(a) = arity(a) − 1

   width(a_1 ... a_n) = Σ_{i∈[1..n]} width(a_i)

This definition can be used to characterize traversal sequences.

Lemma 3. A sequence of symbols a_1 ... a_n ∈ (Σ_Π)* is a traversal sequence of
some term t ∈ T(Σ) if, and only if,

   width(a_1 ... a_n) = −1, and

   width(a_1 ... a_i) ≥ 0, for any i ∈ [1..n − 1].
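The characterization of Lemma 3 is easy to test mechanically. The following Python sketch is an illustration; the arity table `ARITY` is an assumed example signature, and the permutation annotations are left out because the width of a symbol only depends on its arity.

```python
# Assumed example signature: three binary symbols and four constants.
ARITY = {"f": 2, "g": 2, "h": 2, "a": 0, "b": 0, "c": 0, "d": 0}

def width(seq, arity=ARITY):
    """Sum of arity(s) - 1 over the symbols of the sequence."""
    return sum(arity[s] - 1 for s in seq)

def is_traversal(seq, arity=ARITY):
    """Lemma 3: the whole sequence has width -1 and every proper,
    non-empty prefix has width >= 0."""
    if not seq:
        return False
    total = 0
    for i, s in enumerate(seq):
        total += arity[s] - 1
        if i < len(seq) - 1 and total < 0:
            return False
    return total == -1

print(is_traversal("fagbhcd"), is_traversal("fghcdba"), is_traversal("aaaa"))
```

Intuitively, the width of a prefix counts the subterms that are still pending minus one, so a traversal is complete exactly when the width first drops to −1.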

Now we define the rank of a traversal sequence, and by extension, the rank of
a term as the minimal rank of its traversal sequences. This definition coincides
with the definition given in the introduction for the rank of a term for binary
signatures.

Definition 4. Given a sequence of symbols a_1 ... a_n ∈ (Σ_Π)*, we define its rank
as

   rank(a_1 ... a_n) = max{width(a_i ... a_j) | i, j ∈ [1..n]}

Given a term t ∈ T(Σ), we define its rank as

   rank(t) = min{rank(w) | w is a traversal of t}
   f g h c d b a        f a g h c d b        f a g b h c d

Fig. 2. Representations of the function f(i) = width(a_1 ... a_i), for some traversal
sequences of f(a, g(b, h(c, d))).

In general, a term has more than one traversal sequence associated. The rank
of the term is always smaller than or equal to the rank of its traversals, and for at
least one of them we have equality. These rank-minimal traversals are relevant
for us, and we choose one of them as the normal traversal sequence. In Figure 2
the third traversal sequence is the normal one.
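The ranks of the three traversal sequences of f(a, g(b, h(c, d))) can be recomputed with the following Python sketch. It is an illustration under assumptions: the arity table is an assumed example signature, permutation annotations are omitted, terms are pairs (symbol, list of subterms), and the empty subsequence is counted with width 0 so that a single constant gets rank 0 as in the introduction.

```python
from itertools import permutations, product

# Assumed example signature.
ARITY = {"f": 2, "g": 2, "h": 2, "a": 0, "b": 0, "c": 0, "d": 0}

def seq_rank(seq, arity=ARITY):
    """Maximal width of a contiguous subsequence (Kadane's algorithm).
    Assumption: the empty subsequence counts with width 0."""
    best = cur = 0
    for s in seq:
        cur = max(0, cur + arity[s] - 1)
        best = max(best, cur)
    return best

def traversals(term):
    """All traversal sequences of a term (symbol, [subterms])."""
    sym, args = term
    if not args:
        return [[sym]]
    child_travs = [traversals(arg) for arg in args]
    result = []
    for perm in permutations(range(len(args))):
        for choice in product(*(child_travs[i] for i in perm)):
            seq = [sym]
            for part in choice:
                seq.extend(part)
            result.append(seq)
    return result

def term_rank(term):
    """Rank of a term: the minimal rank of its traversal sequences."""
    return min(seq_rank(seq) for seq in traversals(term))

def const(name):
    return (name, [])

t = ("f", [const("a"), ("g", [const("b"), ("h", [const("c"), const("d")])])])

print(seq_rank("fghcdba"), seq_rank("faghcdb"), seq_rank("fagbhcd"))
print(term_rank(t))
```

The brute-force enumeration in `traversals` is exponential and only meant for small examples; it makes visible that the third sequence of Figure 2 attains the minimum.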

Definition 5. Given a term t, its normal traversal sequence NF(t) is defined
recursively as follows:

1. If t = a then NF(t) = a.

2. If t = f(t_1,...,t_n) then let ρ ∈ Π_n be the permutation satisfying …

   Then, NF(t) = f^ρ NF(t_ρ(1)) ... NF(t_ρ(n)).

Lemma 6. For any term, its normal traversal sequence has minimal rank, i.e.
rank(t) = rank(NF(t)).

Rank-upper bounded traversal sequences define a regular language. The construction
of associated automata can be found in pdioH].

Lemma 7. Given an extended signature Σ_Π and a constant k, the following set
is a regular language.

   R^k_Π = {s ∈ (Σ_Π)* | rank(s) ≤ k ∧ s is a traversal}

Proof. We can define R^k_Π inductively as follows:



R% = {E,r 

R% = U ( 



U( U R%-^+^ ■ ■ ■ R%-^)* Eo 



0 



n>l 
5 Traversal Equations

In this section we introduce traversal equations. Solvability of traversal equations
is still an open question, but we prove that a variant of them (the so called rank-
and permutation-bounded traversal equations) can be reduced to word equations
with regular constraints, which are decidable. This reduction is
somewhat inspired by the reduction from trace equations to word equations used
in [II ) M M hTj to prove decidability of trace equations. Later, in Section 7, we
will reduce context unification to solvability of traversal equations. We need the
rank-bound conjecture to prove that the reduction can be done to rank- and
permutation-bounded traversal equations.

Definition 8. A traversal system over an extended signature Σ_Π with word variables
W is a conjunction of literals, where every literal has the form
w_1 = w_2 (word equation), w_1 = w_2 (traversal equation) or w ∈ R (regular
constraint), w_i ∈ (Σ_Π ∪ W)* being words with variables and R ⊆ (Σ_Π)* a
regular language.

A solution of a traversal system is a word substitution σ : W → (Σ_Π)* such that

1. σ(w_1) = σ(w_2) for any word equation w_1 = w_2,

2. σ(w_1) and σ(w_2) are both traversal sequences of the same term, for any
   traversal equation w_1 = w_2,

3. and σ(w) belongs to R, for any regular constraint w ∈ R.

Definition 9. A traversal system is said to be rank-bounded if, for every traversal
equation w_1 = w_2, there exist two constants k_1 and k_2, and two regular constraints
w_1 ∈ R^{k_1}_Π and w_2 ∈ R^{k_2}_Π in the system, where R^k_Π is the (regular) set of
k-bounded traversal sequences.

We can transform rank-bounded traversal systems into equivalent traversal
systems using the following transformation rules.

Definition 10. The following rules define a non-deterministic translation procedure
from rank-bounded traversal systems into word equations with regular constraints.

Rule 1: For some n-ary symbol f ∈ Σ and permutations ρ_1, ρ_2 ∈ Π_n, we replace
the traversal equation w_1 = w_2 and the corresponding regular constraints



W\ G R^ and W2 G by 



wi G 
W 2 G R^i 




where X\, X2 and {T, L"/}ig[i,.n] are fresh word variables. 
Rule 2: We replace the traversal equation w_1 = w_2 and the corresponding regular
constraints w_1 ∈ R^{k_1}_Π and w_2 ∈ R^{k_2}_Π by

W\ = W2 

w\ G R^^ 

W2 e 

If the rank of a traversal sequence w_1 ... w_n is bounded by k_1, then, for
any i ∈ [1..n], the rank of w_i is bounded by k_1 − n + i. These are the values of the
exponents used in the regular restrictions of the right-hand side of Rule 1.
Rank-boundedness is crucial in order to ensure soundness of Rule 2. For instance, the
traversal equation X a a Y = Y a a X has no solution, whereas the word equation
X a a Y = Y a a X is solvable. Notice that some substitutions, like X, Y ↦ a,
give equal sequences, but they are not traversal sequences.
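The difference between the two readings of X a a Y = Y a a X can be checked on small instances. The following Python sketch is an illustration under assumptions: the signature {a, f} with a binary f is chosen here, traversal sequences are recognised by the width criterion of Lemma 3, and the search only covers words of length at most 3.

```python
from itertools import product

# Assumed signature: one constant a and one binary symbol f.
ARITY = {"a": 0, "f": 2}

def is_traversal(seq, arity=ARITY):
    """Width criterion: total width -1, every proper prefix of width >= 0."""
    if not seq:
        return False
    total = 0
    for i, s in enumerate(seq):
        total += arity[s] - 1
        if i < len(seq) - 1 and total < 0:
            return False
    return total == -1

def instantiate(word, sub):
    """Replace every word variable by its value; letters stay as they are."""
    return "".join(sub.get(sym, sym) for sym in word)

lhs = ["X", "a", "a", "Y"]
rhs = ["Y", "a", "a", "X"]

# X, Y -> a solves the word equation, but "aaaa" is not a traversal sequence.
sub = {"X": "a", "Y": "a"}
print(instantiate(lhs, sub) == instantiate(rhs, sub), is_traversal(instantiate(lhs, sub)))

# Bounded search: no small X, Y make both sides traversal sequences.
words = ["".join(w) for n in range(4) for w in product("af", repeat=n)]
candidates = [
    (x, y)
    for x in words
    for y in words
    if is_traversal(instantiate(lhs, {"X": x, "Y": y}))
    and is_traversal(instantiate(rhs, {"X": x, "Y": y}))
]
print(candidates)
```

The bounded search is of course no proof, but it agrees with the statement that the traversal equation has no solution.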

Theorem 11. The rules of Definition 10 describe a sound and complete decision
procedure for rank-bounded traversal systems. In other words, for any
rank-bounded traversal system S,

1. if S ⇒* S' and the substitution σ is a solution of S', then σ is also a
   solution of S, and

2. if the substitution σ is a solution of S, then there exists a word unification
   problem with regular constraints S', a transformation sequence S ⇒* S',
   and an extension σ' of σ, such that σ' is a solution of S'.

Unfortunately, this nondeterministic transformation procedure does not always
terminate. Notice that we can have ρ_1(n) = ρ_2(n) = r, and in such a case
we obtain a traversal equation Y_r = Y'_r with the same bounds Y_r ∈ R^{k_1}_Π and
Y'_r ∈ R^{k_2}_Π as the original one. However, these transformation rules can be used
to find solutions σ of equations w_1 = w_2, such that σ(w_1) and σ(w_2) are traversal
sequences for the same term, and they are “similar”, where “similar” means
that they only differ in a bounded number of permutations.

Definition 12. Given two traversal sequences v and w over Σ_Π, we say that
they differ in n permutations if, either

1. v = f^ρ r_1 ... r_m and w = f^ρ s_1 ... s_m, for any i ∈ [1..m], r_i and s_i differ in
   n_i permutations, and Σ_{i∈[1..m]} n_i = n; or

2. v = f^ρ r_ρ(1) ... r_ρ(m) and w = f^τ s_τ(1) ... s_τ(m), where ρ ≠ τ, for any i ∈ [1..m],
   r_i and s_i differ in n_i permutations, and Σ_{i∈[1..m]} n_i = n − 1.

Definition 13. A permutation-bounded traversal equation, noted w_1 =_k w_2, is
a tuple of two words with variables w_1 and w_2, and an integer k.

A substitution σ is said to be a solution of a permutation-bounded traversal
equation w_1 =_k w_2 if σ(w_1) and σ(w_2) are both traversal sequences of the same
term, and they differ in at most k permutations.

A permutation- and rank-bounded traversal system is a rank-bounded traversal
system where all traversal equations are permutation-bounded.



Wi = W2 

nn c nmin{fel,fe2} 

W\ riy 
Theorem 14. Solvability of permutation- and rank-bounded traversal systems
is decidable.

Proof. We can reduce the problem to an equivalent word unification problem
with regular constraints using a variant of the rules of Definition 10 for
permutation-bounded equations, finitely many times.

When we apply Rule 1 with ρ_1 = ρ_2, we transform w_1 =_k w_2 into {Y_i =_{k_i}
Y'_i}_{i∈[1..n]} where Σ_{i∈[1..n]} k_i = k. We can require the existence of i, j ∈ [1..n], such
that i ≠ j, k_i ≠ 0 and k_j ≠ 0 without losing completeness. When we apply
this rule with ρ_1 ≠ ρ_2, we transform w_1 =_k w_2 into {Y_i =_{k_i} Y'_i}_{i∈[1..n]} where
Σ_{i∈[1..n]} k_i = k − 1.

Rule 2 can be applied to transform w_1 =_k w_2 into w_1 = w_2, for any k.

It is easy to prove that this transformation process always terminates using
a multiset ordering on the multisets of bounds of the traversal equations.

6 The Rank-Bound Conjecture 

In this section we introduce the rank-bound conjecture. This is the base of the 
reduction of context unification to permutation- and rank-bounded traversal 
systems that we describe in the next section. As we will see, this conjecture 
is essential in order to prove that the traversal equations that we find in the 
reduction are both permutation-bounded and rank-bounded. 

Conjecture 15 (Rank-Bound Conjecture). There exists a computable function
Φ such that, for any solvable context unification problem t = u there exists
a ground unifier σ satisfying

   rank(σ(t)) ≤ Φ(size(t = u))

The validity of the conjecture is still an open question. In fact, we think that 
the conjecture is true, not only for just one ground unifier, but for any most 
general unifier. This stronger version of the conjecture is not true for second- 
order unification, because we can have most general second-order unifiers with 
arbitrarily large rank, as the following example shows. 

Example 16. The second-order unification problem

   F(f(a,a)) = f(F(a),F(a))

has only one context unifier σ = [F ↦ λx . x]. However, it has infinitely many
second-order unifiers which are not context unifiers, like

   σ = [F ↦ λx . f(f(f(x,x),f(x,x)),f(f(x,x),f(x,x)))]

For any n > 0, there is a second-order unifier where the bound variable x occurs
2^n many times in the body of the function, and the rank of σ(F(f(a,a))) is
equal to n + 1. This term σ(F(f(a,a))) can be represented as follows for n = oo.
In the following Lemma we prove that the conjecture is true for first-order
unification.

Lemma 17. Given a solvable first-order unification problem t = u, its m.g.u. σ
satisfies

   rank(σ(t)) ≤ size(t) + size(u)

Proof. Suppose we have a unification problem t = u like

   g(f(a, b), f(X, X), f(Y, Y)) = g(X, Y, Z)

We can represent it by a directed acyclic graph (DAG) where we have two
initial nodes (one for each side of the equation), and a unique node per variable.
We can solve the unification problem by re-addressing the arrows pointing to
a variable, when this variable is instantiated. Therefore we can represent σ(t)
by means of a DAG D, where size(D) ≤ size(t) + size(u), the size of a
DAG being its number of arrows. This is the representation of the DAG corresponding
to our example (where, for simplicity, we have added a thick arrow instead of
re-addressing arrows pointing to variables):




For any labelling of the original DAG, the same labels in the DAG resulting
from instantiation represent a traversal sequence of σ(t) and a traversal sequence
of σ(u). Defining the rank of a node as the sum of the labels on the path from
the root to this node, the rank of the traversal sequence will be the maximum of
the ranks of all leaves. In our example, this rank is 5 and it is obtained from the
following path



f^f^f 



The rank of a path never exceeds the number of arrows of the DAG, i.e. its
size, because, to avoid failing the occur check, we cannot repeat nodes in a path. Therefore,
when we use an arrow with label n, there are at least n other arrows (the ones
with the same origin) that cannot be contained in the same path. We can
conclude that the traversal sequence s of σ(t) represented in the path satisfies
rank(s) ≤ size(t) + size(u), thus rank(σ(t)) ≤ size(t) + size(u).
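The bound of Lemma 17 can be checked on the example of the proof. The following Python sketch is an illustration under assumptions: it uses a textbook first-order unification with occur check instead of the DAG representation of the proof, variables are encoded as strings and other terms as tuples (symbol, subterm, ...), the rank is computed by brute force over all traversals, and the empty subsequence is counted with width 0.

```python
from itertools import permutations, product

def is_var(t):
    return isinstance(t, str)

def walk(t, s):
    # Follow variable bindings of the substitution s.
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    if is_var(t):
        return t == v
    return any(occurs(v, arg, s) for arg in t[1:])

def unify(t, u, s):
    """Return a (triangular) most general unifier extending s, or None."""
    t, u = walk(t, s), walk(u, s)
    if is_var(t):
        if t == u:
            return s
        return None if occurs(t, u, s) else {**s, t: u}
    if is_var(u):
        return unify(u, t, s)
    if t[0] != u[0] or len(t) != len(u):
        return None
    for arg_t, arg_u in zip(t[1:], u[1:]):
        s = unify(arg_t, arg_u, s)
        if s is None:
            return None
    return s

def apply(t, s):
    t = walk(t, s)
    if is_var(t):
        return t
    return (t[0],) + tuple(apply(arg, s) for arg in t[1:])

def size(t):
    return 1 if is_var(t) else 1 + sum(size(arg) for arg in t[1:])

def width_traversals(t):
    """All traversals of a ground term, as lists of symbol widths."""
    args = t[1:]
    if not args:
        return [[-1]]
    child = [width_traversals(arg) for arg in args]
    result = []
    for perm in permutations(range(len(args))):
        for choice in product(*(child[i] for i in perm)):
            seq = [len(args) - 1]
            for part in choice:
                seq.extend(part)
            result.append(seq)
    return result

def seq_rank(widths):
    best = cur = 0
    for w in widths:
        cur = max(0, cur + w)
        best = max(best, cur)
    return best

def term_rank(t):
    return min(seq_rank(ws) for ws in width_traversals(t))

a, b = ("a",), ("b",)
t = ("g", ("f", a, b), ("f", "X", "X"), ("f", "Y", "Y"))
u = ("g", "X", "Y", "Z")
mgu = unify(t, u, {})
instance = apply(t, mgu)
print(term_rank(instance), size(t) + size(u))
```

On this example the rank of σ(t) stays well below size(t) + size(u), which is consistent with the lemma giving only an upper bound.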
7 Reducing Context Unification to Traversal Equations

In this section we prove that context unification can be reduced to solvability
of traversal systems. Moreover, we also prove that if the rank-bound conjecture
is true, then this reduction can be done to permutation- and rank-bounded
traversal systems. Therefore, if the conjecture is true, then context unification
is decidable.

The reduction is very similar to the naive reduction described in Section 2.
First-order variables X are encoded as word variables X' such that, if σ is a
solution of the context unification problem, and σ' is the corresponding solution
of the equivalent word unification problem, then σ'(X') = NF(σ(X)).

For every n-ary context variable F, we would need n + 1 word variables
F_0,...,F_n, such that σ'(F_0 a F_1 a ... F_{n−1} a F_n) = NF(σ(F(a,..., a))). However, this
simple translation does not work. If a term t contains two occurrences of a
first-order variable X, then NF(σ(t)) will contain two occurrences of NF(σ(X)).
However, two different occurrences of a context variable can have different arguments,
and this means that the context σ(F) can be traversed in different ways,
depending on the arguments. Notice that, in general, even if NF(t[a]) = w_0 a w_1,
we can have NF(t[u]) ≠ w_0 NF(u) w_1. Fortunately, the different ways in which
the occurrences of σ(F) are traversed in the normal form of σ(t) are not very
different, i.e. they differ in at most a bounded number of permutations.

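Example 18. Let $\sigma(F) = \lambda x\,.\,f(f(x,t_1),t_2)$, where $\mathrm{rank}(t_1) < \mathrm{rank}(t_2)$, and
$w_i = \mathrm{NF}(t_i)$, for $i = 1, 2$. Depending on the argument $u$, we have

$$\mathrm{NF}(\sigma(F(u))) = \begin{cases}
\mathrm{NF}(\sigma(u))\; w_1\; w_2 & \text{if } \mathrm{rank}(\sigma(u)) < \mathrm{rank}(t_1) \\
w_1\; \mathrm{NF}(\sigma(u))\; w_2 & \text{if } \mathrm{rank}(t_1) < \mathrm{rank}(\sigma(u)) < \mathrm{rank}(t_2) \\
w_2\; w_1\; \mathrm{NF}(\sigma(u)) & \text{if } \mathrm{rank}(t_2) < \mathrm{rank}(\sigma(u))
\end{cases}$$

For any $u$ and $u'$, $\mathrm{NF}(\sigma(F(u)))$ and $\mathrm{NF}(\sigma(F(u')))$ differ in at most 2 permutations.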

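Lemma 19. Let $F$ be a context variable and $\sigma$ a substitution. For any two
terms $F(t_1,\ldots,t_n)$ and $F(u_1,\ldots,u_n)$, there exist sequences $v_0,\ldots,v_n$, $w_0,\ldots,w_n$ and
permutations $\rho, \tau \in \Pi_n$, such that

$$\mathrm{NF}(\sigma(F(t_1,\ldots,t_n))) = v_0\, \mathrm{NF}(\sigma(t_{\rho(1)})) \cdots v_{n-1}\, \mathrm{NF}(\sigma(t_{\rho(n)}))\, v_n$$
$$\mathrm{NF}(\sigma(F(u_1,\ldots,u_n))) = w_0\, \mathrm{NF}(\sigma(u_{\tau(1)})) \cdots w_{n-1}\, \mathrm{NF}(\sigma(u_{\tau(n)}))\, w_n$$

and, for any sequence of constants $a_1,\ldots,a_n$,

$$v_0\, a_{\rho(1)} \cdots v_{n-1}\, a_{\rho(n)}\, v_n$$
$$w_0\, a_{\tau(1)}\, w_1 \cdots w_{n-1}\, a_{\tau(n)}\, w_n$$

are both traversal sequences of $\sigma(F(a_1,\ldots,a_n))$, and they differ in at most
$n \cdot \mathrm{rank}(\sigma(F(a_1,\ldots,a_n)))$ permutations.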
Notice that we need the rank-bound conjecture in order to bound the value
of $\mathrm{rank}(\sigma(F(a_1,\ldots,a_n)))$, i.e. to prove that these two traversal sequences differ
in a bounded number of permutations.

In the rest of this section we describe how a context unification problem can be
effectively translated into an equivalent system of traversal equations.

Theorem 20. Context unification can be reduced to solvability of traversal systems.

If the Rank-Bound Conjecture is true, then context unification can be reduced to
solvability of permutation- and rank-bounded traversal systems.

Proof. Let $t = u$ be the original context unification problem, and $(\Sigma, \mathcal{X})$ be the
original signature. We assume that $\Sigma$ is finite, and contains at least $2 \cdot n$ distinct
first-order constants $a_1,\ldots,a_n, b_1,\ldots,b_n$, where $n = \max\{\mathrm{arity}(F) \mid F \in \mathrm{Var}(t = u)\}$,
and a binary symbol $f$, and that $a_1,\ldots,a_n, b_1,\ldots,b_n$ do not occur in $t = u$.
Therefore, if a problem is solvable, it has a ground unifier.

First step. The order of the arguments in $F$ and in $\sigma(F)$ is not necessarily the
same. In this first step we guess a permutation $\rho_F \in \Pi_{\mathrm{arity}(F)}$ for every context
variable and transform $t = u$ into $\sigma_0(t) = \sigma_0(u)$ where

$$\sigma_0 = \bigcup_{F \in \mathrm{Var}(t = u)} [F \mapsto \lambda x_1 \cdots x_n\,.\,F'(x_{\rho_F(1)},\ldots,x_{\rho_F(n)})]$$

Now, we can assume that $F'$ and its instance have the arguments in the same
order. Moreover, since $\sigma_0$ is simply a renaming substitution, $t = u$ and
$\sigma_0(t) = \sigma_0(u)$ are equivalent problems.

Second step. We introduce a word variable $X' \in \mathcal{W}$ for every first-order variable
$X \in \mathcal{X}$, and $\mathrm{arity}(F) + 1$ many word variables $F_0^p,\ldots,F_{\mathrm{arity}(F)}^p \in \mathcal{W}$ for every
occurrence $p$ of a context variable $F$ in the problem (notice that in this case we
use different word variables for every occurrence).

We guess a permutation $\rho_p$ for every occurrence of a function symbol $f$ or of
a context variable $F$, with arity greater than or equal to two, at a position $p$ of the
problem.

We define the following translation function $T$ that, given a subterm $t \in \mathcal{T}(\Sigma, \mathcal{X})$
of the problem and its position $p$, returns its translation in terms of
words with variables $w \in (\Sigma_0 \cup \mathcal{W})^*$.

For any first-order constant $a$, or variable $X$,

$$T(a,p) = a \qquad\qquad T(X,p) = X'$$

For every $n$-ary function symbol $f$, or context variable $F$, occurring at
position $p$, let $w_i = T(t_i, p \cdot i)$, and let $\rho_p$ be the permutation guessed for this
position; then

$$T(f(t_1,\ldots,t_n),p) = \text{[right-hand side lost in extraction]}$$
$$T(F(t_1,\ldots,t_n),p) = F_0^p\, w_{\rho_p(1)}\, F_1^p \cdots F_{n-1}^p\, w_{\rho_p(n)}\, F_n^p$$

Finally, the traversal system will contain the following equations:
1. A word equation for the original problem $t = u$:

$$T(t,1) = T(u,2)$$

2a. For any two occurrences $F(t_1,\ldots,t_n)$ and $F(u_1,\ldots,u_n)$ of a context variable
$F$ at positions $p$ and $q$, we introduce the following traversal equations and
regular constraints:

$$T(F(a_1,\ldots,a_n),p) \equiv_k T(F(a_1,\ldots,a_n),q) \qquad T(F(b_1,\ldots,b_n),p) \equiv_k T(F(b_1,\ldots,b_n),q)$$
$$T(F(a_1,\ldots,a_n),p) \in R^{k_1}_{k_2} \qquad T(F(b_1,\ldots,b_n),p) \in R^{k_1}_{k_2}$$
$$T(F(a_1,\ldots,a_n),q) \in R^{k_1}_{k_2} \qquad T(F(b_1,\ldots,b_n),q) \in R^{k_1}_{k_2}$$

where $k = \mathrm{arity}(F) \cdot \Phi(\mathrm{size}(t = u))$, $k_1 = k_2 = \Phi(\mathrm{size}(t = u))$,
and $\Phi$ is the computable function introduced in the rank-bound conjecture.
2b. In case we want to reduce context unification to (non-bounded) traversal
systems, we introduce instead

$$T(F(a_1,\ldots,a_n),p) \equiv T(F(a_1,\ldots,a_n),q)$$
$$T(F(b_1,\ldots,b_n),p) \equiv T(F(b_1,\ldots,b_n),q)$$

In this second case, we do not need the conjecture to fix $k$, $k_1$ and $k_2$.

The duplication of traversal equations with distinct constants $a_i$ and $b_i$ ensures
that these constants occur in the place of the arguments. Otherwise, if
we only introduce a traversal equation $X_0\,a\,X_1 \equiv X_0'\,a\,X_1'$, we can get
solutions like $\sigma = [X_0 \mapsto a][X_1 \mapsto \varepsilon][X_0' \mapsto \varepsilon][X_1' \mapsto a]$, which do
not satisfy $\sigma(X_0\,b\,X_1) \equiv \sigma(X_0'\,b\,X_1')$, and which lead to incompatible definitions
$\sigma(F) = \lambda x\,.\,f(a, x)$ and $\sigma(F) = \lambda x\,.\,f(x, a)$.

Corollary 21. If the Rank-Bound Conjecture is true, then Context Unification 
is decidable. 

Example 22. To conclude, let us see how the problem $X(Y(a, b)) = Y(X(a), b)$ can
be translated into a traversal system.

We guess $\sigma_0$ equal to the identity in the first step. In the second step, we
introduce the word variables $X_0, X_1, X_0', X_1'$ for the two occurrences of $X$, and
$Y_0, Y_1, Y_2, Y_0', Y_1', Y_2'$ for $Y$. For both occurrences of $Y$, the only symbol with
arity 2 or greater, we guess the same permutation $\rho_{1.1} = \rho_2 = [2, 1]$.

The translation of the unification problem then results in:

$$X_0\, Y_0\, b\, Y_1\, a\, Y_2\, X_1 = Y_0'\, b\, Y_1'\, X_0'\, a\, X_1'\, Y_2'$$
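
The word equation of Example 22 can be reproduced mechanically. The following is only a sketch under stated assumptions: terms are encoded as tagged tuples, positions as dotted strings, and word variables are named `F0@p`, `F1@p`, ... (all of these encodings are hypothetical). Only the cases whose definition is legible in the text are covered (constants, first-order variables and context variables); ordinary function symbols are deliberately left out.

```python
# Sketch of the translation T for constants, first-order variables and
# context variables. The encoding below is an assumption made for illustration:
#   ("const", "a")            a first-order constant
#   ("fovar", "X")            a first-order variable
#   ("ctx", "F", [t1, ...])   a context variable applied to its arguments

def translate(term, pos, perms):
    """Return T(term, pos) as a list of symbols and word variables."""
    kind = term[0]
    if kind == "const":                      # T(a, p) = a
        return [term[1]]
    if kind == "fovar":                      # T(X, p) = X'
        return [term[1] + "'"]
    _, name, args = term                     # context variable at position pos
    n = len(args)
    rho = perms.get(pos, list(range(1, n + 1)))   # guessed permutation rho_p
    words = [translate(a, f"{pos}.{i}", perms) for i, a in enumerate(args, 1)]
    out = [f"{name}0@{pos}"]
    for k in range(1, n + 1):                # F_0 w_rho(1) F_1 ... w_rho(n) F_n
        out += words[rho[k - 1] - 1]
        out.append(f"{name}{k}@{pos}")
    return out

a, b = ("const", "a"), ("const", "b")
lhs = ("ctx", "X", [("ctx", "Y", [a, b])])        # X(Y(a, b)) at position 1
rhs = ("ctx", "Y", [("ctx", "X", [a]), b])        # Y(X(a), b) at position 2
perms = {"1.1": [2, 1], "2": [2, 1]}              # same guess for both Y's

left_word = translate(lhs, "1", perms)
right_word = translate(rhs, "2", perms)
print(" ".join(left_word), "=", " ".join(right_word))
```

Tagging each word variable with its position plays the role of the primes in the example: distinct occurrences of the same context variable get distinct word variables.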



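$$X_0\, a_1\, X_1 \equiv_k X_0'\, a_1\, X_1' \qquad X_0\, b_1\, X_1 \equiv_k X_0'\, b_1\, X_1'$$
$$X_0\, a_1\, X_1 \in R^{k}_{k} \qquad X_0\, b_1\, X_1 \in R^{k}_{k}$$
$$X_0'\, a_1\, X_1' \in R^{k}_{k} \qquad X_0'\, b_1\, X_1' \in R^{k}_{k}$$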

Note on step 2a: a context variable occurrence need not appear in more than two traversal
equations. If $F$ has occurrences $p_1,\ldots,p_n$, it suffices to introduce an equation relating
$p_1$ and $p_2$, $p_2$ and $p_3$, ..., $p_{n-1}$ and $p_n$.
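$$Y_0\, a_2\, Y_1\, a_1\, Y_2 \equiv_{2k} Y_0'\, a_2\, Y_1'\, a_1\, Y_2' \qquad Y_0\, b_2\, Y_1\, b_1\, Y_2 \equiv_{2k} Y_0'\, b_2\, Y_1'\, b_1\, Y_2'$$
$$Y_0\, a_2\, Y_1\, a_1\, Y_2 \in R^{k}_{k} \qquad Y_0\, b_2\, Y_1\, b_1\, Y_2 \in R^{k}_{k}$$
$$Y_0'\, a_2\, Y_1'\, a_1\, Y_2' \in R^{k}_{k} \qquad Y_0'\, b_2\, Y_1'\, b_1\, Y_2' \in R^{k}_{k}$$

where $k = \Phi(8)$, and $\Phi$ is the function introduced by the rank-bound conjecture.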

8 Conclusions and Further Work 

In this paper we prove that, if the rank-bound conjecture is true, then context
unification is decidable. The decidability of context unification is still an open
question, and a positive answer would have important implications in very different
research areas. Additionally, we define traversal equations and rank- and
permutation-bounded traversal equations, and prove that solvability of the latter
is decidable.

We are currently trying to prove the rank-bound conjecture, and to find a
reduction from traversal equations to context unification, in order to prove the
equivalence of both problems.
Abstract. A new class of tree-tuple languages is introduced: the weakly
regular relations. It is an extension of the regular case (regular relations)
and a restriction of tree-tuple synchronized languages, that has all usual
nice properties, except closure under complement. Two applications are
presented: to unification modulo a rewrite system, and to one-step rewriting.



1 Introduction 

Several classes of tree-tuple languages (also viewed as tree relations) have been
defined by means of automata or grammars. In particular, a simple one, the
Regular Relations (RR), consists in defining regularity as that of the
tree language (over the product-alphabet) obtained by overlapping the tuple
components.

A more sophisticated class, the Tree-Tuple Synchronized Languages (TTSL),
is obtained by extending RRs with synchronization constraints between
independent branches. TTSLs were first introduced by means of Tree-Tuple
Synchronized Grammars (TTSG) and have been applied to equational unification
[3], to logic program validation [?], and to one-step rewriting theory [?].
They have next been reformulated in a simpler way by means of Constraint
Systems (CS), and applied to rewriting again and to concurrency [?].

RRs have all usual nice properties^1, but their expressiveness is poor. It is just the
opposite for TTSLs. In particular, CSs are closed neither under intersection^2 nor
under complement.

In this paper we define an intermediate class, called Weakly Regular Relations
(WRR), between RRs and synchronized languages, that has all properties of RRs
except closure under complement. Because they are more expressive than RRs,
WRRs make it possible to prove new decidability results:

- on unification modulo a rewrite system. Unlike [?], a non-linear goal is
allowed. This result is obtained by using the general method of [?] for deciding
unifiability with the help of a tree-tuple language.

^1 Except closure under iteration (transitive closure).

^2 Horizontal TTSGs are closed under intersection, provided a more complicated control
is used [?], which causes great difficulties. In particular, we do not know whether they are
then still closed under projection.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 1 8.S- OT1 2001. 

© Springer- Verlag Berlin Heidelberg 2001 






S. Limet, P. Rety, and H. Seidl 



- on the existential one-step rewriting theory. Unlike [?], non-linear rewrite
rules are allowed.

All missing proofs and details can be found in [?].

2 Regular Relations and Synchronized Languages 

Let $\Sigma$ be an alphabet, i.e. a set of symbols with fixed arities. $T_\Sigma$ denotes the set
of ground terms over $\Sigma$. For a position $p$ in $t \in T_\Sigma$, $t|_p$ denotes the subterm of $t$
at position $p$, and $t(p)$ denotes the symbol occurring in $t$ at position $p$.

Given a language $S$ of $l$-tuples, and $T$ of $n$-tuples, and for $i \in \{1,\ldots,l\}$,
$j \in \{1,\ldots,n\}$, the $i,j$-join of $S$ and $T$ is an $(l + n - 1)$-tuple language, denoted
$S \bowtie_{i,j} T$, and defined by:

$$\{(s_1,\ldots,s_l,t_1,\ldots,t_{j-1},t_{j+1},\ldots,t_n) \mid (s_1,\ldots,s_l) \in S \wedge (t_1,\ldots,t_n) \in T \wedge s_i = t_j\}$$

$S \bowtie T$ stands for $S \bowtie_{l,1} T$. For $i_1,\ldots,i_k \in \{1,\ldots,l\}$, the projection of $S$ on
components $i_1,\ldots,i_k$ is the $k$-tuple language, denoted $\Pi_{i_1,\ldots,i_k}(S)$, defined by:

$$\Pi_{i_1,\ldots,i_k}(S) = \{(s_{i_1},\ldots,s_{i_k}) \mid \exists s_j \text{ for each } j \notin \{i_1,\ldots,i_k\},\ (s_1,\ldots,s_l) \in S\}$$

For tuples $s \in S$, $t \in T$, $st$ denotes the $(l + n)$-tuple obtained by concatenation of
$s$ and $t$.
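
On finite tuple sets, join and projection can be written in a few lines. This is an illustrative sketch only: the function names are hypothetical, languages are finite Python sets of tuples of strings, and the joined tuple is assumed to keep every component of the left tuple followed by the right tuple with its $j$-th component dropped.

```python
def join(S, T, i, j):
    """i,j-join of two finite tuple sets (components are numbered from 1)."""
    return {s + t[:j - 1] + t[j:]            # drop the shared component t_j
            for s in S for t in T
            if s[i - 1] == t[j - 1]}

def project(S, indices):
    """Projection of a finite tuple set on the listed components."""
    return {tuple(s[k - 1] for k in indices) for s in S}

# Toy data: pairs (p(t), t) and pairs (result, argument) of some function.
Np = {("p(0)", "0"), ("p(s(0))", "s(0)")}
Nf = {("0", "0"), ("s(0)", "p(0)"), ("p(0)", "s(0)")}

N = join(Np, Nf, 2, 1)        # last component of Np against first of Nf
print(sorted(N))
print(sorted(project(N, [1, 3])))
```

The default join $S \bowtie T$ of the text corresponds to `join(S, T, l, 1)` where `l` is the width of the tuples of `S`.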



2.1 Regular Relations 



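We define regularity for $n$-ary relations as in [2], i.e. a relation $R$ is regular iff
the tree language over the product-alphabet obtained by overlapping the
tuple-components is regular.

Formally, let $\Sigma_\bot^2$ be the product alphabet defined by
$\Sigma_\bot^2 = (\Sigma \cup \{\bot\}) \times (\Sigma \cup \{\bot\}) - \{\bot\bot\}$ where $\bot$ is a new constant.
For $s, t \in T_\Sigma \cup \{\bot\}$ we recursively define $s \otimes t \in T_{\Sigma_\bot^2}$ by:

$$f(s_1,\ldots,s_n) \otimes g(t_1,\ldots,t_m) = \begin{cases}
fg(s_1 \otimes t_1,\ldots,s_n \otimes t_n,\ \bot \otimes t_{n+1},\ldots,\bot \otimes t_m) & \text{if } n < m \\
fg(s_1 \otimes t_1,\ldots,s_m \otimes t_m,\ s_{m+1} \otimes \bot,\ldots,s_n \otimes \bot) & \text{otherwise}
\end{cases}$$

For instance $f(a,g(b)) \otimes f(f(a,a),b)$ is $ff(af(\bot a,\bot a),gb(b\bot))$. This definition
trivially extends to the product of $k$ terms $t_1 \otimes \ldots \otimes t_k$.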
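
The overlapping product can be computed directly from its recursive definition. The sketch below assumes a hypothetical encoding of terms as nested tuples `(symbol, child, ...)` and of product symbols as concatenated strings; it reproduces the instance given above.

```python
BOT = ("⊥",)   # the padding constant, encoded like any other nullary term

def overlap(s, t):
    """Overlap two terms; the shorter argument list is padded with BOT."""
    width = max(len(s), len(t)) - 1                     # arity of the product symbol
    kids_s = list(s[1:]) + [BOT] * (width - len(s) + 1)
    kids_t = list(t[1:]) + [BOT] * (width - len(t) + 1)
    return (s[0] + t[0],) + tuple(overlap(a, b) for a, b in zip(kids_s, kids_t))

def show(t):
    """Render a nested-tuple term in the usual f(x,y) notation."""
    if len(t) == 1:
        return t[0]
    return t[0] + "(" + ",".join(show(k) for k in t[1:]) + ")"

a, b = ("a",), ("b",)
s = ("f", a, ("g", b))          # f(a, g(b))
t = ("f", ("f", a, a), b)       # f(f(a, a), b)
print(show(overlap(s, t)))
```

Padding only ever happens on the side with fewer arguments, so the pair $\bot\bot$ (excluded from the product alphabet) is never produced.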

An $n$-ary relation $R \subseteq T_\Sigma^n$ is regular iff the tree language
$\{t_1 \otimes \ldots \otimes t_n \mid (t_1,\ldots,t_n) \in R\}$ is regular. RR stands for Regular Relation.
For example, $\{(t,t) \mid t \in T_\Sigma\}$ and, for given symbols $f, g \in \Sigma$ of the same arity,
$\{(t, t[f \leftarrow g]) \mid t \in T_\Sigma\}$ are RRs. On the other hand
$\{(t, t_{sym}) \mid t \in T_\Sigma,\ t_{sym} \text{ is the symmetric tree of } t\}$
is not a RR if $\Sigma$ is not monadic.
2.2 Constraint Systems for Synchronized Languages

Synchronized languages refer to the class of languages defined by means of TTSGs
with bounded synchronizations of [?]. The exact definition is rather technical and
will not be given here. The aim of this section is to define a more uniform way
to recognize this class of languages. We use constraint systems (CS), and we
consider their least fix-point solutions. A CS can be viewed as a grammar for
which a bottom-up point of view is adopted. We will sometimes identify
non-terminals with the sets they generate.

Example 1. In the signature $\Sigma = \{f^{\backslash 2}, b^{\backslash 0}\}$ let $Id_2 = \{(t,t) \mid t \in T_\Sigma\}$ be the set
of pairs of identical terms. $Id_2$ can be defined by the following CS:

$$Id_2 \supseteq (b, b)$$
$$Id_2 \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (Id_2, Id_2)$$

where $1_1, 1_2, 2_1, 2_2$ abbreviate pairs (for readability). For example $2_1$ means
$(2,1)$, which denotes the first component of the second argument (the second
$Id_2$). Note that since $1_1$ and $1_2$ come from the same $Id_2$, they represent two
identical terms, in other words they are linked (synchronized), whereas for
example $1_1$ and $2_1$ are independent.



Example 2. Now if we consider the slightly different CS:

$$X_{sym} \supseteq (b, b)$$
$$X_{sym} \supseteq (f(1_1,2_1),\ f(2_2,1_2))\ (X_{sym}, X_{sym})$$

we get the set $L_{sym} = \{(t, t_{sym}) \mid t_{sym} \text{ is the symmetric tree of } t\}$.
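
To see the synchronization at work, one can unfold the two constraints of Example 2 a few times and compare the result with a direct definition of the symmetric tree. This is a sketch with a hypothetical encoding (the constant is the string `"b"`, other terms are triples `("f", left, right)`).

```python
from itertools import product

def mirror(t):
    """Symmetric tree: swap the two arguments of every f."""
    return t if t == "b" else ("f", mirror(t[2]), mirror(t[1]))

def step(pairs):
    """Apply both constraints of Example 2 once to the current set of pairs."""
    new = {("b", "b")}
    for (u1, u2), (v1, v2) in product(pairs, repeat=2):
        # first component f(1_1, 2_1), second component f(2_2, 1_2)
        new.add((("f", u1, v1), ("f", v2, u2)))
    return new

def generate(rounds):
    current = set()
    for _ in range(rounds):
        current = step(current)
    return current

pairs = generate(3)
print(len(pairs), all(t2 == mirror(t1) for t1, t2 in pairs))
```

Because `u1, u2` come from the same pair and `v1, v2` from the same pair, the second component is always the mirror image of the first; picking the four components independently would lose this property.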

Example 3. In the signature $\Sigma = \{s^{\backslash 1}, b^{\backslash 0}\}$ let $L_{dble} = \{(s^n(b), s^{2n}(b))\}$. It can
be defined by the CS:

$$X_{dble} \supseteq (b, b)$$
$$X_{dble} \supseteq (s(1_1),\ s(s(1_2)))\ X_{dble}$$

General Formalization 

Assume we are given a (universal) index set $N$ for tuple components. For $I \subseteq N$
and any set $M$, the set of $I$-tuples $a : I \to M$ is denoted by $M^I$. Often, we
also write $a = (a_i)_{i \in I}$ provided $a(i) = a_i$, which for $I = \{1,\ldots,k\} \subseteq N$ is also
written as $a = (a_1,\ldots,a_k)$.

Different tree tuple languages may refer to tuples of different length, or
to different index sets. Our constraint variables represent tree tuple languages.
Consequently, they have to be equipped with the intended index set. Such an
assignment is called classification. Accordingly, a classified set of tuple variables
(over $N$) is a pair $(\mathcal{X}, \rho)$ where $\rho : \mathcal{X} \to 2^N$ assigns to each variable $X$ a subset
of indices. This subset is called the class of $X$. For convenience and whenever $\rho$






is understood, we omit $\rho$ and denote the classified set $(\mathcal{X}, \rho)$ by $\mathcal{X}$. The maximal
cardinality of the classes in $\mathcal{X}$ is also called the width of $\mathcal{X}$. In particular, in
Example 1, $N = \{1,2\}$, $\mathcal{X} = \{Id_2\}$, $\rho(Id_2) = \{1,2\}$, and the width of $\mathcal{X}$ is 2.

A constraint system (CS) for tree tuple languages consists of a classified set
$(\mathcal{X}, \rho)$ of constraint variables, together with a finite set $\mathcal{E}$ of inequations of the
form

$$X \supseteq \Box(X_1,\ldots,X_k) \qquad (1)$$

where $X, X_1,\ldots,X_k \in \mathcal{X}$ and $\Box$ is an operator mapping the concatenation of
tuples for the variables $X_i$ to tuples for $X$. More precisely, let

$$J = \{(i,x) \mid 1 \le i \le k,\ x \in \rho(X_i)\} \qquad (2)$$

denote the disjoint union of the index sets corresponding to the variables $X_i$ (in
Example 1, $J = \{(1,1), (1,2), (2,1), (2,2)\}$, abbreviated into $\{1_1, 1_2, 2_1, 2_2\}$).
Then $\Box$ denotes a mapping $T_\Sigma^J \to T_\Sigma^{\rho(X)}$. Each component of this mapping
is specified through a tree expression $t$ which may access the components of
the argument tuple and apply constructors from signature $\Sigma$. Thus, $t$ can be
represented as an element of $T_\Sigma(J)$ where $T_\Sigma(J)$ denotes all trees over $\Sigma$ which
additionally may contain nullary symbols from the index set $J$.

Consider, e.g., the second constraint in Example 1. There, the first component
of the operator is given by $t = f(1_1, 2_1)$.

The mapping induced by such a tree $t$ is then defined by replacing every index by
the corresponding component of the argument tuple,

$$t\,(s_j)_{j \in J} = t\{j \mapsto s_j \mid j \in J\}$$

for every $(s_j)_{j \in J} \in T_\Sigma^J$. Accordingly, $\Box$ is given by a tuple $\Box \in T_\Sigma(J)^{\rho(X)}$.

Let us collect a set of useful special forms of constraint systems. The
constraint (1) is called

- non-copying iff no index $j \in J$ occurs twice in $\Box$;

- horizontal iff for any components $\Box_x$, $\Box_y$ and any positions $p$, $q$, $\Box_x|_p = (i,x')$,
$\Box_y|_q = (i,y')$ implies that $p$ and $q$ are positions at the same depth;

- regular iff each component of $\Box$ is a single constructor application of the
form $\Box_x = a_x((1,x),\ldots,(n,x))$ for each $x \in \rho(X)$ ($a_x$ being of arity $n$).
Note that regularity implies horizontality.

The whole constraint system is called non-copying (horizontal, regular) iff each
constraint in $\mathcal{E}$ is so.

For example, the CSs that define $Id_2$ and $L_{sym}$ are non-copying and
horizontal. Moreover $Id_2$ is regular. On the other hand, $L_{dble}$ is not horizontal.

The class of non-copying CSs corresponds to the class of TTSGs of [7], so it
recognizes synchronized languages. In the rest of the paper, we only consider
non-copying CSs even if it is not explicitly written.

Proposition 1. The class of non-copying CSs is closed under union,
projection and cartesian product. Moreover, membership and emptiness are decidable.
Proposition 2. RRs are exactly the languages defined by regular CSs.

Proof. If the top symbols of all components of $\Box$ have the same arity, i.e. $\forall x,y \in \rho(X)$,
$\mathrm{arity}(a_x) = \mathrm{arity}(a_y)$, then it is obvious, since overlapping tuple-components
amounts to synchronizing identical positions together. Otherwise, a non-regular
CS may define a RR, like for example $\{X \supseteq (s(1_1), f(1_2, 2_1))(X, Y),\ X \supseteq (b, b),\ Y \supseteq b\}$.
But it can be transformed into a regular CS by changing the index
set of $Y$ into $\rho(Y) = \{2\}$. The CS becomes $\{X \supseteq (s(1_1), f(1_2, 2_2))(X, Y),\ X \supseteq (b, b),\ Y \supseteq b\}$,
which is regular. However, for simplicity, we always consider in
examples that for any $X$, $\rho(X) = \{1,2,\ldots\}$.

A variable assignment is a mapping $\sigma$ assigning to each variable $X \in \mathcal{X}$ a
subset of $T_\Sigma^{\rho(X)}$ (i.e. a set of tuples of ground terms). Variable assignment $\sigma$
satisfies the constraint (1) iff

$$\sigma(X) \supseteq \{\Box(t_1,\ldots,t_k) \mid t_i \in \sigma(X_i)\} \qquad (3)$$

Note that the $\Box$-operator is applied to the cartesian product of the argument sets
$\sigma(X_i)$. In particular, the tuples inside the argument sets are kept "synchronized"
while tuples from different argument sets may be arbitrarily combined.

The variable assignment $\sigma$ is a solution of the constraint system iff it satisfies
all constraints in the system. Since the operator application in (3) is monotonic,
even continuous (w.r.t. set inclusion of tuple languages), we conclude that each
constraint system has a unique least solution.
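
Since the operators are monotonic, the least solution can be approximated from below by iterating from the empty assignment. The sketch below is a bounded version of that iteration, not a decision procedure; the representation of a constraint as a triple `(head, operator, argument variables)` with the operator given as a Python function is an assumption made for illustration. It is run on the system of Example 3.

```python
from itertools import product

def least_solution(constraints, rounds):
    """Bounded fix-point iteration: `rounds` applications of all constraints."""
    sol = {head: set() for head, _, _ in constraints}
    for _ in range(rounds):
        nxt = {x: set(v) for x, v in sol.items()}
        for head, op, args in constraints:
            # the operator ranges over the cartesian product of the argument sets
            for combo in product(*(sol[a] for a in args)):
                nxt[head].add(op(*combo))
        sol = nxt
    return sol

def count_s(t):
    """Number of s symbols in a term built from "b" and ("s", t)."""
    return 0 if t == "b" else 1 + count_s(t[1])

# Example 3: X ⊇ (b, b) and X ⊇ (s(1_1), s(s(1_2))) X
dble = [
    ("X", lambda: ("b", "b"), []),
    ("X", lambda u: (("s", u[0]), ("s", ("s", u[1]))), ["X"]),
]

approx = least_solution(dble, 4)["X"]
print(sorted((count_s(a), count_s(b)) for a, b in approx))
```

Each round adds the pairs that need one more constraint application, so after `rounds` iterations the approximation contains exactly the tuples derivable in at most that many steps.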



3 Study of Intersection of Synchronized Languages 

Finding a CS that recognizes the intersection of two synchronized languages is 
difficult, precisely because of synchronizations. The first example exhibits the 
first difficulty: deadlocks. 

Example 4. Let $L_1$ and $L_2$ be recognized respectively by the variables $X_1$ and $X_2$
of the following constraint systems:

$L_1$:
$$X_1 \supseteq (1_1,\ g(1_2))\ Y_1$$
$$Y_1 \supseteq (f(1_1),\ f(1_2))\ Z_1$$
$$Z_1 \supseteq (g(b),\ b)$$

$L_2$:
$$X_2 \supseteq (f(1_1),\ 1_2)\ Y_2$$
$$Y_2 \supseteq (g(1_1),\ g(1_2))\ Z_2$$
$$Z_2 \supseteq (b,\ f(b))$$

Clearly $L_1 = L_2 = \{(f(g(b)), g(f(b)))\}$. So $L_1 \cap L_2$ is not empty, but in $L_1$
occurrences 1 (in the first component) and 2.1 (in the second component) are
synchronized, whereas occurrences 2 and 1.1 are synchronized in $L_2$. This
produces a deadlock when trying to run the two CSs in parallel to recognize $L_1 \cap L_2$.

More generally, it is possible to encode the Post Correspondence Problem by 
testing emptiness of the intersection of two synchronized languages. 

Theorem 1. Emptiness of the intersection of synchronized languages is unde- 
cidable. 
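Proof. Let $A$ and $C$ be two alphabets and $\varphi$, $\varphi'$ be morphisms from $A^*$ to $C^*$.
Let us consider the two synchronized tree languages $L = \{(\alpha, \varphi(\alpha)) \mid \alpha \in A^+\}$
and $L' = \{(\alpha, \varphi'(\alpha)) \mid \alpha \in A^+\}$. Let us write $\varphi(a_i) = c_{i,1} \cdots c_{i,p_i}$ and
$\varphi'(a_i) = c'_{i,1} \cdots c'_{i,p'_i}$ for each $a_i \in A$. Then $L$ and $L'$ are recognized by the following CSs^3:

$L$:
$$X \supseteq (a_i(1_1),\ c_{i,1} \cdots c_{i,p_i}(1_2))\ X \qquad \forall a_i \in A$$
$$X \supseteq (a_i(\top),\ c_{i,1} \cdots c_{i,p_i}(\top)) \qquad \forall a_i \in A$$

$L'$:
$$X' \supseteq (a_i(1_1),\ c'_{i,1} \cdots c'_{i,p'_i}(1_2))\ X' \qquad \forall a_i \in A$$
$$X' \supseteq (a_i(\top),\ c'_{i,1} \cdots c'_{i,p'_i}(\top)) \qquad \forall a_i \in A$$

So deciding whether $L \cap L'$ is empty amounts to deciding the existence of
$\alpha \in A^+$ such that $\varphi(\alpha) = \varphi'(\alpha)$, i.e. solving the Post correspondence problem.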



Therefore, since emptiness of synchronized languages is decidable [?], this
class is not effectively closed under intersection.

Considering the two previous examples, it seems unavoidable to consider a
subclass of synchronized languages to get closure under intersection. The first
idea is to avoid deadlocks by forbidding leaning synchronizations (i.e. imposing
that synchronization points are always at the same depth): from now on we consider
only horizontal CSs.

Example 5. Consider again the languages $Id_2$ and $L_{sym}$ as defined in Examples 1
and 2. $Id_2 \cap L_{sym}$ is the set of pairs of terms $(t, t')$ such that $t = t'$ and $t$ is the
symmetric tree of $t'$. This means that $t$ is a self-symmetric tree. So $Id_2 \cap L_{sym}$
is the set $\{(b,b)\} \cup \{(f(t,t_{sym}), f(t,t_{sym}))\}$ where $t_{sym}$ denotes the symmetric
tree of $t$. Let $X = \{(t, t_{sym}, t, t_{sym})\}$; then $Id_2 \cap L_{sym}$ can be defined by

$$Id_2 \cap L_{sym} \supseteq (b,b) \mid (f(1_1,1_2),\ f(1_3,1_4))\ X$$
$$X \supseteq (f(1_1,2_1),\ f(2_2,1_2),\ f(1_3,2_3),\ f(2_4,1_4))\ (X, X) \mid (b,b,b,b)$$

It seems that horizontality is a good criterion to get closure under intersec- 
tion, but unfortunately it is not enough as shown by the following example. 

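Example 6. Let $L_{pb}$ be the language defined by the following constraints:

$$L_{pb} \supseteq (f(1_1,1_2),\ f(2_1,2_2))\ (Id_2, L_{pb}) \qquad L_{pb} \supseteq (b,b)$$

[Figure: a picture giving an intuition of what $L_{pb}$ looks like; not recoverable from the source.]

^3 $\top$ just marks word ends.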
The intersection of $L_{pb}$ with $Id_2$ gives the language of pairs of identical
balanced terms (i.e. terms all branches of which have the same length). To express
it with a non-copying CS, we have to synchronize all occurrences of the same
depth together, which requires wider and wider tuples, and hence infinitely many
intermediate languages. This is impossible because a CS is always supposed to
be finite. That is why the definition of WRRs needs restrictions stronger than
horizontality.



4 Weakly Regular Relations 

A horizontal constraint system $C = (\mathcal{X}, \mathcal{E})$ is weakly regular iff there is a rank
function $r : \mathcal{X} \to \mathbb{N}$ s.t. for every constraint $c$ given by $X \supseteq \Box(X_1,\ldots,X_k)$:

- $r(X) \ge r(X_1) + \ldots + r(X_k)$; and

- $r(X) = r(X_1) + \ldots + r(X_k)$ only provided $c$ is regular.

In essence, this definition implies that each tree tuple of any variable $X$ in
the system is constructed by using at most $r(X)$ horizontal but non-regular
constraints.

Accordingly, a (set of tree tuples or a) tree relation is called weakly regular
(WRR stands for Weakly Regular Relation) iff it is defined by a weakly regular
constraint system.
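
Checking a candidate rank function against the two conditions is mechanical. The sketch below assumes a hypothetical encoding in which each constraint is a triple `(head, argument variables, is_regular)`; horizontality is taken for granted and the regularity flag is supplied by hand. It is applied to the system for the one-step rewriting relation $S$ given in the example that follows this definition, and to the system of Example 2.

```python
def is_rank_function(constraints, rank):
    """Check both conditions of the definition for a candidate rank function."""
    for head, args, regular in constraints:
        total = sum(rank[a] for a in args)
        if rank[head] < total:                    # r(X) >= r(X1) + ... + r(Xk)
            return False
        if rank[head] == total and not regular:   # equality only for regular constraints
            return False
    return True

# One-step rewriting for f(x, y) -> g(y, x): four regular constraints that
# descend into one argument, one non-regular constraint that applies the rule.
rewrite_step = [
    ("S", ["S", "Id2"], True), ("S", ["S", "Id2"], True),
    ("S", ["Id2", "S"], True), ("S", ["Id2", "S"], True),
    ("S", ["Id2", "Id2"], False),
    ("Id2", ["Id2", "Id2"], True), ("Id2", ["Id2", "Id2"], True),
]

# Example 2: the recursive constraint swaps arguments, so it is not regular.
symmetric = [("Xsym", [], True), ("Xsym", ["Xsym", "Xsym"], False)]

print(is_rank_function(rewrite_step, {"Id2": 0, "S": 1}))
print(any(is_rank_function(symmetric, {"Xsym": r}) for r in range(6)))
```

For the system of Example 2 the non-regular recursive constraint would need $r(X_{sym}) > 2\,r(X_{sym})$, which no natural number satisfies, so the search over small ranks fails as expected.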

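Example 7. Let $\Sigma = \{f^{\backslash 2}, g^{\backslash 2}, a^{\backslash 0}\}$, consider the rewrite system $R = \{f(x, y) \to g(y, x)\}$,
and let $S = \{(t_1, t_2) \mid t_1 \to_R t_2\} = \{(C[f(x,y)\sigma],\ C[g(y,x)\sigma])\}$. $S$ is a
WRR:

$$S \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (S, Id_2) \qquad S \supseteq (g(1_1,2_1),\ g(1_2,2_2))\ (S, Id_2)$$
$$S \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (Id_2, S) \qquad S \supseteq (g(1_1,2_1),\ g(1_2,2_2))\ (Id_2, S)$$
$$S \supseteq (f(1_1,2_1),\ g(2_2,1_2))\ (Id_2, Id_2)$$
$$Id_2 \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (Id_2, Id_2) \qquad Id_2 \supseteq (g(1_1,2_1),\ g(1_2,2_2))\ (Id_2, Id_2)$$

with $r(Id_2) = 0$, $r(S) = 1$.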



Fact 1.
- If the constraint system $C$ is weakly regular, then it is weakly regular
for a rank function with maximal rank $2^{|C|}$ (where $|C|$ denotes the number of
variables of $C$).

- It can be decided in linear time whether or not a constraint system is weakly
regular.

There is another structural characterization of weakly regular constraint systems.
For a variable $X$, let us denote by $C_X$ the restriction of the constraint
system $C$ to all variables possibly influencing $X$. Let us call a constraint
$X \supseteq \Box(X_1,\ldots,X_k)$ recursive iff some variable $X_j$ on the right-hand side
depends on $X$, i.e., $X$ occurs in $C_{X_j}$. Then we have:

Theorem 2. A horizontal constraint system $C$ is weakly regular iff every recursive
constraint $X \supseteq \Box(X_1,\ldots,X_k)$ is regular where furthermore the constraint
systems $C_{X_i}$ are regular for all $i$ except at most one.
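
The structural criterion of Theorem 2 only needs the dependency graph between variables. The following sketch assumes the same hypothetical encoding of constraints as triples `(head, argument variables, is_regular)`, assumes the system is already horizontal, and uses a simple quadratic traversal rather than a linear-time algorithm.

```python
def influencers(constraints, x):
    """Variables of C_x: x and everything reachable through right-hand sides."""
    seen, todo = set(), [x]
    while todo:
        v = todo.pop()
        if v in seen:
            continue
        seen.add(v)
        for head, args, _ in constraints:
            if head == v:
                todo.extend(args)
    return seen

def is_recursive(constraints, constraint):
    head, args, _ = constraint
    return any(head in influencers(constraints, a) for a in args)

def weakly_regular(constraints):
    """Criterion of Theorem 2 for a horizontal constraint system."""
    def subsystem_regular(x):
        scope = influencers(constraints, x)
        return all(reg for head, _, reg in constraints if head in scope)
    for c in constraints:
        if is_recursive(constraints, c):
            _, args, regular = c
            if not regular:
                return False
            if sum(not subsystem_regular(a) for a in args) > 1:
                return False
    return True

rewrite_step = [
    ("S", ["S", "Id2"], True), ("S", ["Id2", "S"], True),
    ("S", ["Id2", "Id2"], False),
    ("Id2", ["Id2", "Id2"], True),
]
symmetric = [("Xsym", [], True), ("Xsym", ["Xsym", "Xsym"], False)]
print(weakly_regular(rewrite_step), weakly_regular(symmetric))
```

In the first system the only non-regular constraint is not recursive, and every recursive constraint has at most one non-regular argument subsystem (that of `S`); in the second, the recursive constraint itself is non-regular.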

Example 7 indeed defines a WRR, since every recursive rule is regular and
$Id_2$ is a RR. On the other hand, $L_{sym}$ (Example 2) is not a WRR.

Our main new result is: 

Theorem 3. Weakly regular relations are closed under union, projection,
cartesian product and intersection.

The first three closure properties of WRRs stated in Theorem 3 are obtained
simply by considering the corresponding constructions for arbitrary tree tuple
constraint systems and verifying that these preserve the WRR property. In the
sequel, we therefore concentrate on the proof of closure under intersection. We
need the following auxiliary notion.

A constraint system $C$ is single-permuting iff every non-regular constraint of
$C$ has only one variable on the right-hand side, i.e., is of the form

$$X \supseteq \Box(X_1)$$

The proof of Theorem 3 is based on the following two auxiliary lemmas:

Lemma 1. Let $C$ be a weakly regular constraint system. Then an equivalent
(up to further auxiliary variables) weakly regular constraint system $C'$ can be
constructed which is single-permuting.

If $C$ has maximal rank $r$, maximal size of classes $d$, at most $a$ variables
in right-hand sides and size $n$, then $C'$ has size $O(\ldots)$ [bound illegible in the source]
and classes of size $O(d \cdot a^r)$, where neither the rank nor the number of variables
in right-hand sides has increased. Moreover, the constraint system $C'$ can be
constructed in double-exponential time.



Lemma 2. Assume that the tree-tuple language $L$ of class $I$ is defined by a
single-permuting constraint system $C$ and ${\sim} \subseteq I \times I$ is an equivalence relation.
Then a single-permuting constraint system $C'$ can be constructed for the language

$$L_\sim = \{t \in L \mid \forall (i,j) \in {\sim},\ t_i = t_j\}$$

In particular, if $C$ was weakly regular, then so is $C'$. If $C$ is of size $n$ and has
classes of size at most $d$, then $C'$ can be constructed in time [factor illegible in the source] $\cdot\, n$.

Using Lemmas 1 and 2, the proof of Theorem 3 proceeds as follows. Assume
we are given languages $L_1$ and $L_2$, both defined by weakly regular constraint
systems $C_i$, $i = 1,2$. By closure under cartesian product, the language

$$L = \{t^{(1)} t^{(2)} \mid t^{(i)} \in L_i\}$$

is defined by a weakly regular constraint system $C$. By
Lemma 1 we can replace $C$ with a single-permuting (weakly regular) constraint
system $C'$. Now consider the equivalence relation $\sim$ on the index set of $L$ which
equates the corresponding components from $L_1$ and $L_2$. By Lemma 2 we can
construct from $C'$ a constraint system $C''$ which defines $L_\sim$. Finally, we may
(due to closure under projection) construct a weakly regular constraint system
for the projection of $L_\sim$ onto the components of $L_1$, which is exactly

$$L_1 \cap L_2$$

and we are done. Calculating the costs of the individual construction steps, we
furthermore find that the overall construction can be implemented in
double-exponential time.

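As an immediate corollary of Theorem 3 we obtain:

Corollary 1. Weakly regular relations are closed under joins.

Proof. Assume $L_1$ (resp. $L_2$) is an $l$-tuple (resp. $n$-tuple) language. Then

$$L_1 \bowtie_{i,j} L_2 = \Pi_{1,\ldots,l+j-1,\,l+j+1,\ldots,l+n}\big((L_1 \times L_2) \cap \{(t_1,\ldots,t_i,\ldots,t_i,\ldots,t_{l+n}) \mid \forall k,\ t_k \in T_\Sigma\}\big)$$

where the second occurrence of $t_i$ stands at component $l+j$.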



Non-closure under Complement 

Consider the balanced full binary trees. Balanced means that all leaves appear
at the same depth, and it is well known that a full binary tree $t$ is balanced iff
for all non-leaf positions $v$ in $t$, $t|_{v.1} = t|_{v.2}$. Therefore $t$ is unbalanced iff there is
a non-leaf position $v$ and a position $u$ s.t. $t(v.1.u) \neq t(v.2.u)$.
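
The equivalence between the two descriptions of balanced trees can be checked exhaustively on small trees. This sketch uses a hypothetical encoding (the leaf is the string `"a"`, inner nodes are triples `("f", left, right)`) and enumerates all full binary trees up to a small height.

```python
from itertools import product

def full_trees(height):
    """All full binary trees over f/2 and a/0 of height at most `height`."""
    trees = {"a"}
    for _ in range(height):
        trees = trees | {("f", l, r) for l, r in product(trees, repeat=2)}
    return trees

def leaf_depths(t, depth=0):
    if t == "a":
        return {depth}
    return leaf_depths(t[1], depth + 1) | leaf_depths(t[2], depth + 1)

def balanced_by_depth(t):
    """All leaves appear at the same depth."""
    return len(leaf_depths(t)) == 1

def balanced_by_subterms(t):
    """At every non-leaf position the two immediate subterms are identical."""
    if t == "a":
        return True
    return t[1] == t[2] and balanced_by_subterms(t[1]) and balanced_by_subterms(t[2])

trees = full_trees(3)
print(len(trees), all(balanced_by_depth(t) == balanced_by_subterms(t) for t in trees))
```

The second characterization is the one that matters here: it turns "balanced" into a family of equalities between sibling subterms, and its negation into the existence of a single pair of differing positions.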

Full binary trees are simulated by terms (with fixed arities) over the
signature $\Sigma = \{f^{\backslash 2}, a^{\backslash 0}\}$. Thus the set of unbalanced full binary trees is the set
of unbalanced terms over $\Sigma$, which is generated by variable $V$ of the following
constraint system:

$$V \supseteq f(1_1, 2_1)\ (V, Id)$$
$$V \supseteq f(1_1, 2_1)\ (Id, V)$$
$$V \supseteq f(1_1, 1_2)\ U$$

$$U \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (U, Id^2)$$
$$U \supseteq (f(1_1,2_1),\ f(1_2,2_2))\ (Id^2, U)$$
$$U \supseteq (a,\ f(Id, Id))$$
$$U \supseteq (f(Id, Id),\ a)$$

where $Id = T_\Sigma$ and $Id^2 = Id \times Id$. It is a WRR since $Id$ is regular and the only
non-regular constraint is $V \supseteq f(1_1, 1_2)\,U$, which is not recursive.

The complement of the unbalanced terms, i.e. the balanced ones, cannot be
defined by a WRR since, as shown in Section 3, they cannot be defined by a
non-copying constraint system.



5 Application to R-Unification

This section addresses the problem of unification modulo a confluent constructor- 
based rewrite system. The goal is to decide the existence of data-unifiers (unifi- 
ability), and to express ground data-unifiers by a tree-tuple language. 

Under some restrictions, a decidability result has been established using
TTSGs [?]. Next the method has been generalized [?]: any class of tree-tuple
languages can be used, provided:

1. it is closed under join,

2. emptiness is decidable,

3. it can express the tuple-set $N_f = \{(r, \sigma x_1,\ldots,\sigma x_n) \mid f(x_1,\ldots,x_n) \leadsto^*_{[\sigma]} r$
and $r, \sigma x_1,\ldots,\sigma x_n$ are ground data-terms$\}$ for each defined function $f$,

4. for each constructor $c$, it can express the tuple-set
$N_c = \{(c(t_1,\ldots,t_n), t_1,\ldots,t_n) \mid t_i \text{ are ground data-terms}\}$.

If in addition it is closed under intersection^4 and projection, the goal to be unified
may be non-linear.

The unification method presented in [?] has been used with TTSGs and
with primal grammars [?], providing some decidable subclasses of R-unification
problems. In this section, we present the unification method of [?], which can
be used with WRRs to obtain a new subclass of decidable R-unification problems.
The restrictions are those needed for TTSGs, except that the goal and the
non-recursive rewrite rules may be non-linear^5; additional technical restrictions are
needed, otherwise unifiability is undecidable as shown in [?].

Let us first recall the principle of the general method, using a simple example: 

Example 8. Let

$$R = \{\, f(s(x)) \xrightarrow{r_1} p(f(x)),\quad f(p(x)) \xrightarrow{r_2} s(f(x)),\quad f(0) \xrightarrow{r_3} 0 \,\}$$

where $s, p, 0$ are constructors, and consider the linear goal $p(f(x)) = f(s(f(x')))$.
We assume that we have a tree-tuple language that satisfies the above properties.
In particular it can express $N_f$, $N_s$, $N_p$.

The method consists in simulating the innermost narrowing derivations issued
from the goal. We compute:

$$N = N_p \bowtie N_f = \{(s_1, s_2, t) \mid (s_1, s_2) \in N_p \wedge (s_2, t) \in N_f\}$$
$$\phantom{N} = \{(s_1, s_2, t) \mid p(f(x)) \leadsto^*_{[x/t]} p(s_2) = s_1\}$$

$$N' = N_f \bowtie N_s \bowtie N_f = \{(s_1', s_2', s_3', t') \mid (s_1', s_2') \in N_f \wedge (s_2', s_3') \in N_s \wedge (s_3', t') \in N_f\}$$
$$\phantom{N'} = \{(s_1', s_2', s_3', t') \mid f(s(f(x'))) \leadsto^*_{[x'/t']} f(s(s_3')) = f(s_2') \leadsto^* s_1'\}$$

From narrowing properties [?], there exists a data-unifier iff there exist instances
$t, t'$ such that $s_1 = s_1'$, i.e. such that $N \bowtie_{1,1} N' \neq \emptyset$. Moreover $N \bowtie_{1,1} N'$
expresses the solutions thanks to $t, t'$.

Now if the goal to be unified is not linear, like $p(f(x)) = f(s(f(x)))$, $t$
must in addition be equal to $t'$. By projection, we keep only $t, t', s_1, s_1'$, and
force equalities by intersection. Thus, there exists a data-unifier iff
$\Pi_{1,3}(N) \cap \Pi_{1,4}(N') \neq \emptyset$.

For each narrowing step occurring within a narrowing derivation $f(x) \leadsto^*_{[\sigma]} r$
where $r$ is a ground data-term, either $s$ is added in $\sigma$ and $p$ in $r$ (using $r_1$), or

^4 This implies the closure under join.

^5 WRRs are closed under intersection and projection.
the opposite (using $r_2$), and the derivation ends by adding 0 in both (using $r_3$).
So $N_f$ can be described by the expression:

$$N_f = ((p,s) \cup (s,p))^*.(0,0)$$

On the other hand

$$N_s = \{(s(t),t)\} = (s,\epsilon).((s,s) \cup (p,p))^*.(0,0)$$
$$N_p = \{(p(t),t)\} = (p,\epsilon).((s,s) \cup (p,p))^*.(0,0)$$

where $\epsilon$ is the empty string. Consequently,

$$N = (p,\epsilon,\epsilon).((s,s,p) \cup (p,p,s))^*.(0,0,0)$$
$$N_f \bowtie N_s = (p,s,\epsilon).((p,s,s) \cup (s,p,p))^*.(0,0,0)$$
$$N' = (N_f \bowtie N_s) \bowtie N_f = (p,s,\epsilon,\epsilon).((p,s,s,p) \cup (s,p,p,s))^*.(0,0,0,0)$$
$$\Pi_{1,3}(N) \cap \Pi_{1,4}(N') = (p,\epsilon).((s,p) \cup (p,s))^*.(0,0) \,\cap\, (p,\epsilon).((p,p) \cup (s,s))^*.(0,0) = (p(0),0)$$

The solutions are given by the second component. So there is one solution: $x/0$.
This is correct: if $x/t$ is a data-unifier of $p(f(x)) = f(s(f(x)))$, necessarily
$p(f(t)) = p(f(f(t)))$, then $f(t) = f(f(t))$, then $f(t) = t$ because $f(f(t)) = t$.
Therefore $t = 0$.
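
The conclusion of this example can be cross-checked by brute force on small data-terms. This is only a sanity check under an assumed encoding (a data-term such as $s(p(0))$ is written as the string `"sp0"`), not the tree-tuple method itself: the function `f` evaluates the defined symbol $f$ by applying the three rewrite rules innermost.

```python
from itertools import product

def f(t):
    """Normal form of f(t) for a ground data-term t over s, p, 0."""
    if t == "0":
        return "0"                       # f(0) -> 0
    head, rest = t[0], t[1:]
    if head == "s":
        return "p" + f(rest)             # f(s(x)) -> p(f(x))
    return "s" + f(rest)                 # f(p(x)) -> s(f(x))

def data_terms(max_depth):
    """All data-terms with at most max_depth unary constructors above 0."""
    for n in range(max_depth + 1):
        for prefix in product("sp", repeat=n):
            yield "".join(prefix) + "0"

def is_unifier(t):
    """Does x/t solve the non-linear goal p(f(x)) = f(s(f(x)))?"""
    return "p" + f(t) == f("s" + f(t))

solutions = [t for t in data_terms(5) if is_unifier(t)]
print(solutions)

# Column-wise, (f(t), t) always matches ((p,s) U (s,p))* . (0,0)
columns = {pair for t in data_terms(4) for pair in zip(f(t), t)}
print(sorted(columns))
```

Reading the pairs `(f(t), t)` column by column recovers the regular expression given for $N_f$, which is what makes the relation expressible letter by letter.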



Using WRRs 

$N_c$: unfortunately, the sets $N_c = \{(c(t_1,\ldots,t_n), t_1,\ldots,t_n)\}$ cannot be
expressed by WRRs (nor by RRs), because generating two copies of $t_i$ needs
synchronization between them, and one occurs on top and the other at depth 1. So
this language is not horizontal.

We slightly modify the method so that $N_c$ is horizontal. Now, using an extra
symbol $\natural$, we define $N_c = \{(c(t_1,\ldots,t_n), \natural t_1,\ldots,\natural t_n)\}$ for each constructor $c$,
and $\natural L = \{(\natural t_1,\ldots,\natural t_n) \mid (t_1,\ldots,t_n) \in L\}$ for any language $L$. Their constraint
systems are:

$$N_c \supseteq (c(1_1,\ldots,n_1),\ \natural 1_2,\ldots,\natural n_2)\ (Id_2,\ldots,Id_2)$$
$$\natural L \supseteq (\natural 1_1,\ldots,\natural 1_n)\ (L)$$

where $Id_2 = \{(t,t) \mid t \in T_C\}$ and $T_C$ is the set of constructor-terms. These
constraints are not recursive and $Id_2$ is regular. So $N_c$ is a WRR and, if $L$ is
regular, $\natural L$ is a WRR.

It is however necessary to slightly modify the computation of $N$ and $N'$.

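Example 9. Consider the previous example again. Now:

$N = N_p \bowtie \natural N_f = \{(r_1, \natural r_2, \natural t) \mid (r_1, \natural r_2) \in N_p \wedge (\natural r_2, \natural t) \in \natural N_f\}$

$= \{(r_1, \natural r_2, \natural t) \mid p(f(x)) \leadsto^{*}_{[x/t]} p(r_2) = r_1\}$

$N' = N_f \bowtie N_s \bowtie \natural N_f$

$= \{(r'_1, r'_2, \natural r'_3, \natural t') \mid (r'_1, r'_2) \in N_f \wedge (r'_2, \natural r'_3) \in N_s \wedge (\natural r'_3, \natural t') \in \natural N_f\}$

$= \{(r'_1, r'_2, \natural r'_3, \natural t') \mid f(s(f(x'))) \leadsto^{*}_{[x'/t']} f(s(r'_3)) = f(r'_2) \leadsto^{*} r'_1\}$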

If the goal is linear, we check N IV' 0 as previously. Otherwise we can still 
force t = t' by computing $\Pi_{1,3}(N) \cap \Pi_{1,4}(N')$, because the number of $\natural$ above t
in N and t' in N' are the same (one). This comes from the fact that the number
of constructors appearing above the two occurrences of x in the goal is the same 
(one). 

Definition 1. For a term t and $u \in Pos(t)$, let $\|u\|$ denote the number of
constructors appearing in t above u. The term t is weak-horizontal if

$\forall u, u' \in Pos(t),\ t(u) = t(u') = x \in Var(t) \implies \|u\| = \|u'\|$

In this case we also define $\|x\| = \|u\|$. The goal t = t' (resp. the rewrite rule
$l \to r$) is weak-horizontal if t = t' (resp. $l \to r$) is a weak-horizontal term,
considering that = (resp. $\to$) is a binary symbol.
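Definition 1 can be checked mechanically on small terms. The sketch below is an illustration under assumptions: terms are encoded as nested tuples `(symbol, arg1, ..., argn)`, variables as plain strings, and the set of constructors is passed explicitly. The names `constructor_counts` and `is_weak_horizontal` are hypothetical.

```python
def constructor_counts(term, constructors, above=0, acc=None):
    """Map each variable to the set of values ||u|| over its occurrences u,
    where ||u|| is the number of constructors strictly above position u."""
    if acc is None:
        acc = {}
    if isinstance(term, str):          # a variable occurrence
        acc.setdefault(term, set()).add(above)
        return acc
    symbol = term[0]
    below = above + (1 if symbol in constructors else 0)
    for arg in term[1:]:
        constructor_counts(arg, constructors, below, acc)
    return acc

def is_weak_horizontal(term, constructors):
    """True if every variable sees the same number of constructors above
    each of its occurrences."""
    counts = constructor_counts(term, constructors)
    return all(len(values) == 1 for values in counts.values())
```

For the goal p(f(x)) = f(s(f(x))) of the running example, encoded with "=" as a binary symbol, both occurrences of x have exactly one constructor above them, so the goal is weak-horizontal with ||x|| = 1.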



Lemma 3. If the goal is weak-horizontal and x is a variable occurring several
times in the goal, then every component in N, N' that gives the instances of an
occurrence of x contains exactly $\|x\|$ times $\natural$.

Proof. When using n-ary symbols, the intermediate language corresponding
to position u ($N^{\epsilon} = N$) is

$N^{u} = ((N_c \bowtie_{2,1} \natural N^{u.1}) \bowtie_{3,1} \natural N^{u.2}) \ldots$

if t(u) is a constructor, and

$N^{u} = ((N_f \bowtie_{2,1} N^{u.1}) \bowtie_{3,1} N^{u.2}) \ldots N^{u.n}$

otherwise. If x occurs below u, by induction we get that the number of $\natural$ in
the component giving the instances of x is exactly the number of constructors
occurring along the path from u to x.

Thus, if the goal is weak-horizontal the modified method works, provided Nf 
can be expressed. 

$N_f$: let us explain how to transform a rewrite system into a constraint system
that expresses the sets $N_f$. We give an example before the general algorithm.

Example 10. Consider the rewrite rule $f(c(x,y)) \to d(f(y),x)$. By narrowing,
we get $f(x) \leadsto_{[\sigma = x/c(x,y)]} t' = d(f(y),x)$. So $(t',\sigma) = (d(f(y),x), c(x,y))$. To
get ground data-terms, x should be instantiated by something and f(y) should
be narrowed further. Then the corresponding constraint is:

$F \supseteq (d(1_1,2_1), c(2_2,1_2))(F, Id_2)$

Definition 2. A function position p in a term t is shallow if $t|_p = f(x_1, \ldots, x_n)$
where $x_1, \ldots, x_n$ are variables.



Definition 3. Given a rewrite system R, we define an ordering on defined function
symbols by f > g if $f(t_1, \ldots, t_n) \to r \in R$ and g occurs in r. f = g means
f > g $\wedge$ g > f. The rewrite rule $f(t_1, \ldots, t_n) \to r \in R$ is recursive if there is a
function g in r s.t. f = g.
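The recursive rules of Definition 3 can be computed from the call graph of the defined function symbols. The sketch below is illustrative only. It assumes that the ordering > is read as the transitive closure of the direct "g occurs in the right-hand side of a rule for f" relation, that terms are nested tuples with variables as strings, and that the defined symbols are exactly the root symbols of left-hand sides. All names are hypothetical.

```python
def defined_symbols_in(term, defined):
    """Defined function symbols occurring in a term."""
    if isinstance(term, str):          # a variable
        return set()
    found = {term[0]} if term[0] in defined else set()
    for arg in term[1:]:
        found |= defined_symbols_in(arg, defined)
    return found

def recursive_rules(rules):
    """rules: list of (lhs, rhs) pairs with lhs = (f, t1, ..., tn).
    Returns the rules f(...) -> r such that some g in r satisfies
    f > g and g > f (with > taken as a transitive relation)."""
    defined = {lhs[0] for lhs, _ in rules}
    reach = {f: set() for f in defined}
    for lhs, rhs in rules:
        reach[lhs[0]] |= defined_symbols_in(rhs, defined)
    changed = True
    while changed:                      # naive transitive closure
        changed = False
        for f in defined:
            for g in list(reach[f]):
                extra = reach[g] - reach[f]
                if extra:
                    reach[f] |= extra
                    changed = True
    return [(lhs, rhs) for lhs, rhs in rules
            if any(lhs[0] in reach[g]
                   for g in defined_symbols_in(rhs, defined))]
```

A rule whose right-hand side only calls functions that never call back (such as g(x) -> f(x) when f does not depend on g) is reported as non-recursive.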

Algorithm: we assume that function calls in rhs's are shallow, and recursive
rewrite rules are linear. Let us write rewrite rules as follows: $\to$

$C[u_1, \ldots, u_m]$ where C is a constructor context containing no variables and each
$u_i$ is either a variable or a function call of the form $f_i(x^i_1, \ldots, x^i_{k_i})$. For each

rewrite rule, we create the constraint

where $X_i = Id_2$ if $u_i$ is a variable and $X_i = F_i$ if $u_i = f_i(x^i_1, \ldots, x^i_{k_i})$ and $\theta$ is
the substitution $\{u_i/i_2 \mid u_i$ is a variable$\}$.

If this rewrite rule is not linear and not recursive, the $F_i$'s are defined
independently of F. So we can make some variables equal by computing the intersection
of the constraint argument $(X_1, \ldots, X_n)$ with the regular relation

}(si, . . . 5 /c -|_2 , • . • t , Sp-|_2 T(J } 

However, the resultant constraint system is not necessarily a WRR: in the
previous example, the constraint is recursive and not regular. The non-regularity
comes from the fact that when going through the leaves from left to right, we
get x, y for the lhs, and y, x for the rhs: there is a variable permutation. If
we remove the permutation by considering the rule $f(c(x,y)) \to d(x,f(y))$, the
resultant constraint is regular. But the absence of variable permutation does not
ensure regularity.

Example 11. Let $f(a, c(x,y)) \to s(f(x,y))$.

The corresponding constraint $F \supseteq (s(1_1), a, c(1_2, 1_3))F$ is not regular because
there is an internal synchronization in the third component.

And the presence of permutation does not imply non-regularity.

Example 12. Let $f(c(x,y), s(z)) \to c(f(x,z), y)$.

The corresponding constraint $F \supseteq (c(1_1, 2_1), c(1_2, 2_2), s(1_3))(F, Id_2)$ is regular.

This is why we introduce the following definition: 

Definition 4. A constructor-based rewrite system is weak-regular if function
positions in rhs's are shallow, its recursive rules are linear, and the corresponding
constraint system generated by the above algorithm is a WRR.

If one of the $F_i$'s is not a WRR, the algorithm fails.

Note that weak-regularity implies in particular weak-horizontality of rewrite
rules. Thanks to decidability of emptiness for WRRs, we get:

Theorem 4. The unifiability of a weak-horizontal goal modulo a weak-regular
confluent constructor-based rewrite system is decidable. Moreover, the unifiers
can be expressed by a WRR.

6 Application to One-Step-Rewriting 

Given a signature $\Sigma$, the theory of one-step rewriting for a finite rewrite system
R is the first order theory over the universe of ground $\Sigma$-terms that uses the
only predicate symbol $\to$, where $x \to y$ means x rewrites into y by one step.

It has been shown undecidable in [...]. Sharper undecidability results have
been obtained for some subclasses of rewrite systems, about the $\exists^*\forall^*$-fragment
[...] and the $\exists^*\forall^*\exists^*$-fragment [...].

It has been shown decidable in the case of unary signatures [...], in the case
of linear rewrite systems whose left and right members do not share any
variables [...], for the positive existential fragment, for the whole existential
fragment in the case of quasi-shallow rewrite systems [...], and also in the case of
linear, non-overlapping, non-$\epsilon$-left-right-overlapping rewrite systems [...].
Thanks to WRRs, we get a new result about the existential fragment.

Definition 5. A rewrite system R is $\epsilon$-left-right-clashing if for every rewrite rule
$l \to r$, $l(\epsilon) \neq r(\epsilon)$. R is horizontal if in each rewrite rule, all occurrences of the
same variable appear at the same depth.

Note that $\epsilon$-left-right-clashing excludes collapsing rules.
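Both conditions of Definition 5 are syntactic and easy to test. The following sketch assumes terms encoded as nested tuples `(symbol, arg1, ..., argn)` with variables as plain strings. Following the note above, a rule whose right-hand side is a variable (a collapsing rule) is treated as non-clashing. The function names are hypothetical.

```python
def root_symbol(term):
    """Top symbol of a term, or None if the term is a variable."""
    return None if isinstance(term, str) else term[0]

def occurrence_depths(term, depth=0, acc=None):
    """Map each variable to the set of depths at which it occurs."""
    if acc is None:
        acc = {}
    if isinstance(term, str):
        acc.setdefault(term, set()).add(depth)
    else:
        for arg in term[1:]:
            occurrence_depths(arg, depth + 1, acc)
    return acc

def is_eps_left_right_clashing(rules):
    """l(eps) differs from r(eps) for every rule; collapsing rules are
    rejected, in line with the note after Definition 5."""
    return all(root_symbol(r) is not None and root_symbol(l) != root_symbol(r)
               for l, r in rules)

def is_horizontal(rules):
    """In each rule, all occurrences of a variable are at the same depth."""
    for l, r in rules:
        depths = occurrence_depths(l)
        occurrence_depths(r, 0, depths)   # accumulate rhs depths as well
        if any(len(d) > 1 for d in depths.values()):
            return False
    return True
```

For instance, the rule of Example 10 is clashing (f versus d at the root) but not horizontal, since x occurs at depth 2 on the left and depth 1 on the right.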

Theorem 5. The existential one-step rewriting theory is decidable in the case
of $\epsilon$-left-right-clashing horizontal rewrite systems.

Since quasi-shallowness is a particular case of horizontality, our result extends
that of [...] except that we assume in addition $\epsilon$-left-right-clashing. Compared
to [...], rewrite rules may now be non-linear and overlapping, but they must be
horizontal.

Even the theory of several-step rewriting is decidable.

All variables in the rewrite rules occur at depth one.

I.e. no left-hand-side overlaps on top with the corresponding right-hand-side.

Consider a finite rewrite system $R = \{ru_1, \ldots, ru_n\}$ and an existential
formula in the prenex form. Our decision procedure consists in the following steps:

1. Since the symbols of $\Sigma$ are not allowed in formulas, every atom is of the form
$x \to x$ or $x \to y$. Because of $\epsilon$-left-right-clashing, $x \to x$ has no solutions and
is replaced by the predicate without solutions $\bot$, and $x \to y$ is replaced by
the equivalent proposition $(x \to^{?}_{[ru_1]} y \vee \ldots \vee x \to^{?}_{[ru_n]} y) \wedge x \neq y$, where $\to^{?}_{[ru_i]}$
is the rewrite relation in zero or one step with rule $ru_i$. Next, the formula
is transformed into a disjunction of conjunctions of (possibly) negations of
atoms of the form $x \to^{?}_{[ru_i]} y$ or $x = y$. We show that the set of solutions of
each atom, and of its negation, is a WRR.

2. The solutions of a conjunctive factor are obtained by making cartesian
products with the set $T_\Sigma$ of all ground terms (which is a particular WRR), as well
as intersections. For instance let $C = x \to^{?}_{[ru_1]} y \wedge \neg(y \to^{?}_{[ru_2]} z)$. The
solutions of $x \to^{?}_{[ru_1]} y$, denoted $SOL(x \to^{?}_{[ru_1]} y)$, and those of $\neg(y \to^{?}_{[ru_2]} z)$,
denoted $SOL(\neg(y \to^{?}_{[ru_2]} z))$, are WRRs of pairs, then we can compute
$SOL(C) = SOL(x \to^{?}_{[ru_1]} y) \times T_\Sigma \cap T_\Sigma \times SOL(\neg(y \to^{?}_{[ru_2]} z))$, which is
still a WRR (of triples).

3. The validity of the formula is tested by applying the WRR emptiness test 
on every disjunctive factor. 

$SOL(x = y)$ and $SOL(x \neq y)$ are trivially WRRs since they are RRs.

Lemma 4. $SOL(x \to^{?}_{[ru_i]} y)$ and $SOL(\neg(x \to^{?}_{[ru_i]} y))$ are WRRs.

7 Further Work and Conclusion 

Computing descendants through a rewrite system may give rise to several
applications. [...] shows that the set of descendants of a regular tree language through
a rewrite system is still regular, assuming some restrictions. Using non-regular
languages, like WRRs, still closed under intersection (for applications), could
extend the result of [...] by weakening the restrictions.

Compared to automata with (dis)equality constraints [...], WRRs can define
more constraints than only (dis)equality, but they cannot define the balanced
terms. However, WRRs and the subclass of reduction automata have something
in common: when deriving (recognizing) a term, the number of non-regular
constraints (of (dis)equality constraints) applied is supposed to be bounded.

As shown in Example [...], the intersection of some horizontal CSs that are not
necessarily WRRs, is still a horizontal CS. So, is there a subclass of horizontal
CSs, larger than WRRs, and closed under intersection?

Abstract. We determine the parallel complexity of several (uniform)
membership problems for recognizable tree languages. Furthermore we
show that the word problem for a fixed finitely presented algebra is in
DLOGTIME-uniform $NC^1$.



1 Introduction 

Tree automata are a natural generalization of usual word automata to terms.
Tree automata were introduced in [...] and [...] in order to solve certain
decision problems in logic. Since then they were successfully applied to many other
decision problems in logic and term rewriting, see e.g. [...]. These applications
motivate the investigation of decision problems for tree automata like
emptiness, equivalence, and intersection nonemptiness. Several complexity results are
known for these problems, see [...] for an overview. Another important decision
problem is the membership problem, i.e., the problem whether a given tree
automaton accepts a given term. It is easily seen that this problem can be solved
in deterministic polynomial time [...], but up to now no precise bounds on the
complexity are known.

In this paper we investigate the complexity of several variants of membership
problems for tree automata. In Section 3 we consider the membership problem
for a fixed tree automaton, i.e., for a fixed tree automaton A we ask whether a
given input term is accepted by A. We prove that this problem is contained in the
parallel complexity class DLOGTIME-uniform $NC^1$, and furthermore that there
exists a fixed tree automaton for which this problem is complete for
DLOGTIME-uniform $NC^1$. Using these results, in Section 4 we prove that the word problem
for a fixed finitely presented algebra is in DLOGTIME-uniform $NC^1$. This result
nicely contrasts a result of Kozen that the uniform word problem for finitely
presented algebras is P-complete [...]. Finally in Section 5 we investigate uniform
membership problems for tree automata. In these problems the input consists of
a tree automaton A from some fixed class C of tree automata and a term t, and
we ask whether A accepts t. For the class C we consider the class of all
deterministic top-down, deterministic bottom-up, and nondeterministic (bottom-up)
tree automata, respectively. The complexity of the corresponding uniform
membership problem varies between the classes log-space and LOGCFL, which is the
class of all languages that can be reduced in log-space to a context-free language.
Again we prove several completeness results. The table at the end of this paper
summarizes the presented complexity results for membership problems.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 201- 171^ 2001. 

© Springer-Verlag Berlin Heidelberg 2001

2 Preliminaries

In the following let $\Sigma$ be a finite alphabet. The empty word is denoted by $\epsilon$. The
set of all finite words over $\Sigma$ is $\Sigma^*$. We set $\Sigma^+ = \Sigma^* \setminus \{\epsilon\}$. For $\Gamma \subseteq \Sigma$ we denote
by $|s|_\Gamma$ the number of occurrences of symbols from $\Gamma$ in s. We set $|s| = |s|_\Sigma$. For
a binary relation $\to$ on some set we denote by $\to^+$ ($\to^*$) the transitive (reflexive
and transitive) closure of $\to$. Context-free grammars are defined as usual. If
$G = (N, \Sigma, S, P)$ is a context-free grammar then N is the set of non-terminals,
$\Sigma$ is the set of terminals, $S \in N$ is the initial non-terminal, and $P \subseteq N \times (N \cup \Sigma)^*$
is the finite set of productions. With $\to_G$ we denote the derivation relation of G.
The language generated by G is denoted by L(G). A context-free grammar is
$\epsilon$-free if it does not contain productions of the form $A \to \epsilon$.

We assume that the reader is familiar with the basic concepts of
computational complexity, see for instance [...]. We just recall a few definitions concerning
parallel complexity theory, see [...] for more details. It is not necessary to be
familiar with this field in order to understand the constructions in this paper. L
denotes deterministic logarithmic space. The definition of DLOGTIME-uniformity
and DLOGTIME-reductions can be found in [...]. An important subclass of L is
DLOGTIME-uniform $NC^1$, briefly $uNC^1$. More generally, for $k \geq 1$ the class $uNC^k$
contains all languages K such that there exists a DLOGTIME-uniform family
$(C_n)_{n \geq 0}$ of Boolean circuits with the following properties: (i) for some constant
c the depth of the circuit $C_n$ is bounded by $c \cdot \log(n)^k$, (ii) for some polynomial
p(n) the size of $C_n$, i.e., the number of gates in $C_n$, is bounded by p(n), (iii) all
gates in $C_n$ have fan-in at most two, and (iv) the circuit $C_n$ recognizes exactly
the set of all words in K of length n. By [...], $uNC^1$ is equal to ALOGTIME.

An important subclass of $uNC^1$ is DLOGTIME-uniform $TC^0$, briefly $uTC^0$.
A language K is in $uTC^0$ if there exists a DLOGTIME-uniform family $(C_n)_{n \geq 0}$ of
circuits built up from Boolean gates and majority gates (or equivalently arbitrary
threshold-gates) with the following properties: (i) for some constant c the depth
of the circuit $C_n$ is bounded by c, (ii) for some polynomial p(n) the size of $C_n$ is
bounded by p(n), (iii) all gates in $C_n$ have unbounded fan-in, and (iv) the circuit
$C_n$ recognizes exactly the set of all words in K of length n. For more details see
[...]. In this paper we will use a more convenient characterization of $uTC^0$ using
first-order formulas with majority quantifiers, briefly FOM-formulas. Let $\Sigma$ be
a fixed finite alphabet of symbols. An FOM-formula is built up from the unary
predicate symbols $Q_a$ ($a \in \Sigma$) and the binary predicate symbols < and BIT,
using Boolean operators, first-order quantifiers, and the majority quantifier M.
Such formulas are interpreted over words from $\Sigma^+$. Let $w = a_1 \cdots a_m$, where
$m \geq 1$ and $a_i \in \Sigma$ for $i \in \{1, \ldots, m\}$. If we interpret an FOM-formula over
w then all variables range over the interval $\{1, \ldots, m\}$, < is interpreted by
the usual order on this interval, BIT(n, i) is true if the i-th bit in the binary
representation of n is one (we will not need this predicate any more), and $Q_a(x)$
is true if $a_x = a$. Boolean connectives and first-order quantifiers are interpreted
as usual. Finally the formula $M x\, \varphi(x)$ evaluates to true if $\varphi(x)$ is true for at
least half of all $x \in \{1, \ldots, m\}$. The language defined by an FOM-sentence $\psi$ is
the set of words from $\Sigma^+$ for which the FOM-sentence $\psi$ evaluates to true. For
instance the FOM-sentence

$M x\, Q_a(x) \wedge M x\, Q_b(x) \wedge \forall x, y\, (x < y \to \neg(Q_b(x) \wedge Q_a(y)))$

defines the language $\{a^n b^n \mid n \geq 1\}$. It is well-known that $uTC^0$ is the set of
languages that can be defined by an FOM-sentence [...].
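The semantics of the majority quantifier can be made concrete by evaluating the example sentence directly. The sketch below is an illustration only, restricted to the alphabet {a, b} (an assumption). The names `majority` and `sentence` are hypothetical.

```python
def majority(word, pred):
    """M x pred(x): pred holds for at least half of the positions
    1, ..., |word| (positions are 1-based, as in the text)."""
    m = len(word)
    return 2 * sum(1 for x in range(1, m + 1) if pred(x)) >= m

def sentence(word):
    """Evaluate M x Q_a(x) and M x Q_b(x) and
    forall x,y (x < y -> not (Q_b(x) and Q_a(y))) over a nonempty word."""
    m = len(word)

    def q(letter):
        return lambda x: word[x - 1] == letter

    no_b_before_a = all(
        not (word[x - 1] == "b" and word[y - 1] == "a")
        for x in range(1, m + 1)
        for y in range(x + 1, m + 1)
    )
    return majority(word, q("a")) and majority(word, q("b")) and no_b_before_a
```

Over {a, b}, the two majority conditions force exactly half of the positions to carry a and half to carry b, and the universal part forces all a's to precede all b's.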

In FOM-formulas we will often use constants and relations that can be easily
defined in FOM, like for instance the equality of positions or the constants 1 and
max, which denote the first and last position in a word, respectively. Furthermore
by [..., Lemma 10.1] also the predicates $x + y = z$ and $x = \#y\, \varphi(y)$, i.e., the number
of positions y that satisfy the FOM-formula $\varphi(y)$ is exactly x, can be expressed
in FOM. Finally let us mention that $uNC^1$ also has a logical characterization
similar to $uTC^0$. The only difference is that instead of majority quantifiers
so-called group quantifiers for a non-solvable group are used, see [...] for the details.
Of course the resulting logic is at least as expressive as FOM.

In this paper we also use reductions between problems that can be defined
within FOM. Formally let $f : \Sigma^+ \to \Gamma^+$ be a function such that for some
constant k we have $|f(w)| \leq k \cdot |w|$ for all $w \in \Sigma^+$. Then we say that f is
FOM-definable if there exist formulas $\varphi(x)$ and $\varphi_a(x)$ for $a \in \Gamma$ such that when
interpreted over a word w and $i \in \{1, \ldots, k \cdot |w|\}$ then $\varphi(i)$ evaluates to true if
and only if $i = |f(w)|$, and $\varphi_a(i)$ evaluates to true if and only if the i-th symbol
in f(w) is a (here also all quantified variables in $\varphi$ and $\varphi_a$ range over the interval
$\{1, \ldots, |w|\}$). We say that $\varphi$ and $\varphi_a$ ($a \in \Gamma$) define f.

Lemma 1. Let $f : \Sigma^+ \to \Gamma^+$ be FOM-definable and let $L \subseteq \Sigma^+$, $K \subseteq \Gamma^+$ such
that $w \in L$ if and only if $f(w) \in K$ (in this case we say that L is FOM-reducible
to K). If K is in $uTC^0$ (resp. $uNC^1$) then also L is in $uTC^0$ (resp. $uNC^1$).



Proof. Let $\varphi$, $\varphi_a$ ($a \in \Gamma$) be FOM-formulas that define the function f and let K
be in $uTC^0$, i.e., it can be defined by an FOM-sentence $\psi$. Let $|f(w)| \leq k \cdot |w|$ for
all $w \in \Sigma^+$. In the following we restrict to the case k = 2, the generalization to an
arbitrary k is obvious. In principle we can define the language L by the sentence
that results from $\psi$ by replacing every subformula $Q_a(x)$ by the formula $\varphi_a(x)$.
The only problem is that if we interpret this sentence over a word w then the
variables quantified in $\psi$ have to range over the interval $\{1, \ldots, |f(w)|\}$. Hence
we define L by the FOM-sentence $\exists z\, ((\varphi(z) \wedge \psi^{z,0}) \vee (\varphi(\max + z) \wedge \psi^{z,1}))$,
where the sentence $\psi^{z,i}$ is inductively defined as follows:

$(\exists x\, \psi(x))^{z,i} = \exists x\, ((x \leq i \cdot \max \wedge \psi(x)^{z,i}) \vee (x \leq z \wedge \psi(x + i \cdot \max)^{z,i}))$

$(M x\, \psi(x))^{z,i} = \exists x_1, x_2\, (x_1 = \#y\, (y \leq i \cdot \max \wedge \psi(y)^{z,i}) \wedge x_2 = \#y\, (y \leq z \wedge \psi(y + i \cdot \max)^{z,i}) \wedge 2(x_1 + x_2) \geq z + i \cdot \max)$

$Q_a(x)^{z,i} = \varphi_a(x)$ and $Q_a(x + \max)^{z,i} = \varphi_a(x + \max)$

If K belongs to $uNC^1$ the arguments are similar using the logical characterization
of $uNC^1$. □



This linear length-bound may be replaced by a polynomial bound, but this is not
necessary for this paper.

LOGCFL (respectively LOGDCFL) is the class of all languages that are
log-space reducible to a context-free language (respectively deterministic
context-free language). In [...] it was shown that LOGCFL is the class of all languages
that can be recognized in polynomial time on a log-space bounded auxiliary
pushdown automaton, whereas the deterministic variants of these machines precisely
recognize all languages in LOGDCFL. The following inclusions are well-known
and it is conjectured that they are all proper.

$uTC^0 \subseteq uNC^1 = ALOGTIME \subseteq L \subseteq LOGDCFL \subseteq LOGCFL \subseteq uNC^2 \subseteq P$

A ranked alphabet is a pair ($\mathcal{F}$, arity) where $\mathcal{F}$ is a finite set of function
symbols and arity is a function from $\mathcal{F}$ to $\mathbb{N}$ which assigns to each $a \in \mathcal{F}$ its
arity arity(a). A function symbol a with arity(a) = 0 is called a constant. In
all examples we will use function symbols a and f, where arity(a) = 0 and
arity(f) = 2. Mostly we omit the function arity in the description of a ranked
alphabet. With $\mathcal{F}_i$ we denote the set of all function symbols in $\mathcal{F}$ of arity i. In
FOM-formulas we use $Q_{\mathcal{F}_i}(x)$ as an abbreviation for $\bigvee_{a \in \mathcal{F}_i} Q_a(x)$. Let $\mathcal{X}$ be a
countably infinite set of variables. Then $T(\mathcal{F}, \mathcal{X})$ denotes the set of terms over
$\mathcal{F}$ and $\mathcal{X}$, it is defined as usual. The word tree is used as a synonym for term. We
use the abbreviation $T(\mathcal{F}, \emptyset) = T(\mathcal{F})$, this set is called the set of ground terms
over $\mathcal{F}$. We identify the set $T(\mathcal{F})$ with the corresponding free term algebra over
the signature $\mathcal{F}$. In computational problems terms will be always represented by
their prefix-operator notation, which is a word over the alphabet $\mathcal{F}$. The set of all
these words is known as the Lukasiewicz-language $L(\mathcal{F})$ for the ranked alphabet
$\mathcal{F}$. For instance $ffafaafaa \in L(\mathcal{F})$ but $fafaf \notin L(\mathcal{F})$. When we write terms
we will usually use the prefix-operator notation including brackets and commas
in order to improve readability.

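Lemma 2. For every ranked alphabet $\mathcal{F}$ the language $L(\mathcal{F}) \subseteq \mathcal{F}^+$ is in $uTC^0$.

A similar result for Dyck-languages was shown in [...].

Proof. Let $m = \max\{arity(a) \mid a \in \mathcal{F}\}$. For $s \in \mathcal{F}^+$ define

$\|s\| = \sum_{i=0}^{m} (i - 1) \cdot |s|_{\mathcal{F}_i}$

Then for $s \in \mathcal{F}^+$ it holds $s \in L(\mathcal{F})$ if and only if $\|s\| = -1$ and $\|t\| \geq 0$ for every
prefix $t \neq s$ of s, see [..., p 323]. This characterization can be easily converted
into an FOM-sentence:

$\exists x_0, \ldots, x_m\, (\bigwedge_{i=0}^{m} x_i = \#z\, (Q_{\mathcal{F}_i}(z)) \wedge \sum_{i=0}^{m} (i - 1) \cdot x_i = -1) \wedge$

$\forall y < \max\, \exists x_0, \ldots, x_m\, (\bigwedge_{i=0}^{m} x_i = \#z\, (Q_{\mathcal{F}_i}(z) \wedge z \leq y) \wedge \sum_{i=0}^{m} (i - 1) \cdot x_i \geq 0)$

□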
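The counting characterization used in the proof of Lemma 2 translates directly into a single left-to-right scan. The sketch below is a sequential illustration (it says nothing about circuit depth). It assumes symbols are single characters and arities are given as a dictionary. The names `weight` and `in_lukasiewicz` are hypothetical.

```python
def weight(word, arity):
    """||s|| = sum over all symbols c of s of (arity(c) - 1)."""
    return sum(arity[c] - 1 for c in word)

def in_lukasiewicz(word, arity):
    """s is a term in prefix notation iff ||s|| = -1 and ||t|| >= 0
    for every proper prefix t of s."""
    if not word:
        return False
    total = 0
    for c in word[:-1]:                # all proper nonempty prefixes
        total += arity[c] - 1
        if total < 0:
            return False
    return total + arity[word[-1]] - 1 == -1
```

With arity(f) = 2 and arity(a) = 0 this accepts ffafaafaa and rejects fafaf, matching the examples given in the text.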



On the Parallel Complexity of Tree Automata 205 



If the ranked alphabet $\mathcal{F}$ is clear from the context then in the following we will
always write L instead of $L(\mathcal{F})$. From the formula above it is straightforward to
construct a formula L(i, j) which evaluates to true for a word $a_1 \cdots a_n \in \mathcal{F}^+$ and
two positions $i, j \in \{1, \ldots, n\}$ if and only if $i \leq j$ and $a_i \cdots a_j \in L$. The height
height(t) of the term $t \in T(\mathcal{F})$ is inductively defined by $height(a(t_1, \ldots, t_n)) =
1 + \max\{height(t_1), \ldots, height(t_n)\}$, where $arity(a) = n \geq 0$ and $t_1, \ldots, t_n \in
T(\mathcal{F})$ (here $\max(\emptyset) = 0$).

A term rewriting system, briefly TRS, over a ranked alphabet $\mathcal{F}$ is a finite
set $\mathcal{R} \subseteq T(\mathcal{F}, \mathcal{X}) \times T(\mathcal{F}, \mathcal{X})$ such that for all $(s,t) \in \mathcal{R}$ every variable that
occurs in t also occurs in s and furthermore $s \notin \mathcal{X}$. With a TRS $\mathcal{R}$ the
one-step rewriting relation $\to_{\mathcal{R}}$ over $T(\mathcal{F}, \mathcal{X})$ is associated as usual, see any text on
term rewriting like for instance [...]. A ground term rewriting system $\mathcal{P}$ is a finite
subset of $T(\mathcal{F}) \times T(\mathcal{F})$, i.e., the rules only contain ground terms. The symmetric,
transitive, and reflexive closure of the one-step rewriting relation $\to_{\mathcal{P}}$ of a ground
TRS $\mathcal{P}$ is the smallest congruence relation on the free term algebra $T(\mathcal{F})$ that
contains all pairs in $\mathcal{P}$, it is denoted by $=_{\mathcal{P}}$. The corresponding quotient algebra
$T(\mathcal{F})/=_{\mathcal{P}}$ is denoted by $A(\mathcal{F}, \mathcal{P})$, it is a finitely presented algebra.

For a detailed introduction into the field of tree automata see [...]. A
top-down tree automaton, briefly TDTA, is a tuple $A = (Q, \mathcal{F}, q_0, \mathcal{R})$, where Q
is a finite set of states, $Q \cup \mathcal{F}$ is a ranked alphabet with arity(q) = 1 for all
$q \in Q$, $q_0 \in Q$ is the initial state, and $\mathcal{R}$ is a TRS such that all rules of $\mathcal{R}$ have
the form $q(a(x_1, \ldots, x_n)) \to a(q_1(x_1), \ldots, q_n(x_n))$, where $q, q_1, \ldots, q_n \in Q$,
$x_1, \ldots, x_n \in \mathcal{X}$, $a \in \mathcal{F}$, and arity(a) = n. A is a deterministic TDTA if there
are no two rules in $\mathcal{R}$ with the same left-hand side. The language that is accepted
by a TDTA A is defined by

$T(A) = \{t \in T(\mathcal{F}) \mid q_0(t) \to^{*}_{\mathcal{R}} t\}$.

A bottom-up tree automaton, briefly BUTA, is a tuple $A = (Q, \mathcal{F}, q_f, \mathcal{R})$, where
Q is a finite set of states, $Q \cup \mathcal{F}$ is a ranked alphabet with arity(q) = 1 for all
$q \in Q$, $q_f \in Q$ is the final state, and $\mathcal{R}$ is a TRS such that all rules of $\mathcal{R}$ have
the form $a(q_1(x_1), \ldots, q_n(x_n)) \to q(a(x_1, \ldots, x_n))$, where $q, q_1, \ldots, q_n \in Q$,
$x_1, \ldots, x_n \in \mathcal{X}$, $a \in \mathcal{F}$, and arity(a) = n. A is a deterministic BUTA if there
are no two rules in $\mathcal{R}$ with the same left-hand side. The language that is accepted
by a BUTA A is defined by

$T(A) = \{t \in T(\mathcal{F}) \mid t \to^{*}_{\mathcal{R}} q_f(t)\}$.

It is well known that TDTAs, BUTAs, and deterministic BUTAs, respectively,
all recognize the same subsets of $T(\mathcal{F})$. These subsets are called recognizable tree
languages over $\mathcal{F}$. On the other hand deterministic TDTAs cannot recognize all
recognizable tree languages.
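To make the acceptance condition of a BUTA concrete, the following sketch computes, for a term given in prefix notation, the set of states q such that the term rewrites to q(t). It is a plain sequential illustration and makes no claim about parallel complexity. The rule encoding (a dictionary from `(symbol, tuple_of_child_states)` to a set of states) and all names are assumptions, and the input word is assumed to be a well-formed term.

```python
from itertools import product

def parse(word, arity, pos=0):
    """Parse a term in prefix notation starting at index pos.
    Returns ((symbol, children), next_pos)."""
    symbol = word[pos]
    pos += 1
    children = []
    for _ in range(arity[symbol]):
        child, pos = parse(word, arity, pos)
        children.append(child)
    return (symbol, children), pos

def reachable_states(tree, rules):
    """All states q such that the tree t rewrites to q(t) in the BUTA."""
    symbol, children = tree
    child_sets = [reachable_states(child, rules) for child in children]
    result = set()
    # product() of zero sets yields one empty tuple, which handles constants.
    for combo in product(*child_sets):
        result |= rules.get((symbol, combo), set())
    return result

def accepts(word, arity, rules, final_state):
    """Membership test for a (possibly nondeterministic) BUTA."""
    tree, end = parse(word, arity)
    return end == len(word) and final_state in reachable_states(tree, rules)
```

As a usage example, a two-state automaton with states "e" and "o" can track the parity of the number of occurrences of the constant a; with final state "e" it accepts faa and rejects a.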

As already remarked, if a term is part of the input for a Turing machine
then the term will be encoded by its corresponding word from $L(\mathcal{F})$, where the
symbols from $\mathcal{F}$ are binary coded. A tree automaton will be encoded by basically
listing its rules, we omit the formal details. The membership problem for a fixed
TDTA A, defined over a ranked alphabet $\mathcal{F}$, is the following decision problem:

INPUT: A term $t \in T(\mathcal{F})$.

QUESTION: Does $t \in T(A)$ hold?

If the TDTA A is also part of the input we speak of the uniform membership
problem for TDTAs. It is the following decision problem:

INPUT: A TDTA $A = (Q, \mathcal{F}, q_0, \mathcal{R})$ and a term $t \in T(\mathcal{F})$.

QUESTION: Does $t \in T(A)$ hold?

(Uniform) membership problems for other classes of automata or grammars are
defined analogously. Note that the uniform membership problem for TDTAs can
be reduced trivially to the uniform membership problem for BUTAs (and vice
versa) by reversing the rules. Thus these two problems have the same
computational complexity.

3 Membership Problems 

In this section we will study the membership problem for a fixed recognizable 
tree language. First we need some preliminary results. 

A parenthesis grammar is a context-free grammar $G = (N, \Sigma, S, P)$ that
contains two distinguished terminal symbols ( and ) such that all productions of
G are of the form $A \to (s)$, where $A \in N$ and $s \in (N \cup \Sigma \setminus \{(, )\})^*$. A language
that is generated by a parenthesis grammar is called a parenthesis language.
Parenthesis languages were first studied in [...]. In [...] it was shown that every
parenthesis language is in $uNC^1$.

Lemma 3. Every recognizable tree language is FOM-reducible to a parenthesis
language. Furthermore the uniform membership problem for TDTAs is log-space
reducible to the uniform membership problem for parenthesis grammars.

Proof. Let $A = (Q, \mathcal{F}, q_0, \mathcal{R})$ be a TDTA. Let G be the parenthesis grammar
$G = (Q, \mathcal{F} \cup \{(, )\}, q_0, P)$ where

$P = \{q \to (f q_1 \cdots q_m) \mid q(f(x_1, \ldots, x_m)) \to f(q_1(x_1), \ldots, q_m(x_m)) \in \mathcal{R}\}$.

Let us define a function $\beta : T(\mathcal{F}) \to (\mathcal{F} \cup \{(, )\})^+$ inductively by $\beta(f t_1 \cdots t_m) =
(f \beta(t_1) \cdots \beta(t_m))$ for $f \in \mathcal{F}_m$ and $t_1, \ldots, t_m \in L(\mathcal{F})$. Then we have $t \in T(A)$ if
and only if $\beta(t) \in L(G)$. Thus by Lemma 1 it suffices to show that the function
$\beta$ is FOM-definable. Let $t = a_1 \cdots a_n$, where $a_j \in \mathcal{F}$. Then in order to construct
$\beta(t)$ from t, an opening bracket has to be inserted in front of every symbol in t.
Furthermore for $j \in \{1, \ldots, n\}$ the number of closing brackets following $a_j$ in
$\beta(t)$ is precisely the number of positions $i \leq j$ such that $a_i \cdots a_j \in L$. Hence $\beta$
can be defined by the following formulas, where $a \in \mathcal{F}$:

$\varphi(x) = (x = 3 \cdot \max)$

$\varphi_a(x) = \exists y, z\, (Q_a(y) \wedge z = \#i\, (\exists j\, (j < y \wedge L(i,j))) \wedge x = 2y + z)$

$\varphi_{(}(x) = \bigvee_{a \in \mathcal{F}} \varphi_a(x + 1)$

For the second statement note that in the uniform case all constructions can be
easily done in log-space. □
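The counting argument behind the FOM-definition of the bracketing function can be checked on examples. The sketch below computes the bracketed word in two ways: recursively from the inductive definition, and position by position using the count of closing brackets described in the proof. Single-character symbols, the arity dictionary, and all function names are assumptions of this illustration.

```python
def is_term(word, arity):
    """Lukasiewicz test: total weight -1, reached only at the last symbol."""
    total = 0
    for k, c in enumerate(word):
        total += arity[c] - 1
        if total < 0:
            return k == len(word) - 1
    return False

def beta(word, arity):
    """Inductive definition: beta(f t1...tm) = ( f beta(t1) ... beta(tm) ).
    Assumes word is a well-formed term in prefix notation."""
    out = []

    def visit(pos):
        symbol = word[pos]
        out.extend(["(", symbol])
        pos += 1
        for _ in range(arity[symbol]):
            pos = visit(pos)
        out.append(")")
        return pos

    visit(0)
    return "".join(out)

def closing_after(word, arity, j):
    """Number of positions i <= j (1-based) such that a_i ... a_j is a term."""
    return sum(1 for i in range(1, j + 1) if is_term(word[i - 1:j], arity))

def beta_by_counting(word, arity):
    """One '(' before every symbol, then the counted number of ')'."""
    return "".join("(" + c + ")" * closing_after(word, arity, j)
                   for j, c in enumerate(word, start=1))
```

Both constructions agree on well-formed terms, and the output has length 3|t|, which is the content of the formula for the length of the image.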

Theorem 1. Let T be a fixed recognizable tree language. Then the
membership problem for T is in $uNC^1$. Furthermore there exists a fixed deterministic
TDTA A such that the membership problem for T(A) is $uNC^1$-complete under
DLOGTIME-reductions.

Proof. The first statement follows from Lemma 3 and the results of [...]. For the
hardness part let $L \subseteq \Sigma^*$ be a fixed regular word language, whose membership
problem is $uNC^1$-complete under DLOGTIME-reductions. By [..., Proposition
6.4] such a language exists. If we define arity(a) = 1 for all $a \in \Sigma$ and let $\# \notin \Sigma$
be a constant then we can identify a word $a_1 a_2 \cdots a_n \in \Sigma^*$ with the ground
term $a_1 a_2 \cdots a_n \# \in T(\Sigma \cup \{\#\})$, and the language L can be recognized by a
fixed deterministic TDTA. □

4 Word Problems for Finitely Presented Algebras 

In this section we present an application of Theorem 1 to the word problem
for a finitely presented algebra. The uniform word problem for finitely presented
algebras is the following problem:

INPUT: A ranked alphabet $\mathcal{F}$, a ground TRS $\mathcal{P}$ over $\mathcal{F}$, and $t_1, t_2 \in T(\mathcal{F})$.
QUESTION: Does $t_1 =_{\mathcal{P}} t_2$ hold?

In [...] it was shown that the uniform word problem for finitely presented algebras
is P-complete. Here we will study the word problem for a fixed finitely presented
algebra $A(\mathcal{F}, \mathcal{P})$, where $\mathcal{F}$ is a fixed ranked alphabet, and $\mathcal{P}$ is a fixed ground
TRS over $\mathcal{F}$:

INPUT: Two ground terms $t_1, t_2 \in T(\mathcal{F})$.

QUESTION: Does $t_1 =_{\mathcal{P}} t_2$ hold?

For the rest of this section let us fix two ground terms $t_1, t_2 \in T(\mathcal{F})$. We want to
decide whether $t_1 =_{\mathcal{P}} t_2$. The following definition is taken from [...]. Let $\Box$ be a
new constant. Let $\Delta = (\mathcal{F} \cup \{\Box\}) \times (\mathcal{F} \cup \{\Box\}) \setminus \{(\Box, \Box)\}$ and define the arity of
$[\alpha, \beta] \in \Delta$ by $\max\{arity(\alpha), arity(\beta)\}$. We define the function $\sigma : T(\mathcal{F}) \times T(\mathcal{F}) \to
T(\Delta)$ inductively by

$\sigma(f(u_1, \ldots, u_m), g(v_1, \ldots, v_n)) =
[f,g](\sigma(u_1,v_1), \ldots, \sigma(u_n,v_n), \sigma(u_{n+1},\Box), \ldots, \sigma(u_m,\Box))$

if $m \geq n$ plus the symmetric rules for the case m < n. The term $\sigma(t_1,t_2)$ is a
kind of parallel superposition of $t_1$ and $t_2$.

Example 1. Let $t_1 = faffafaaa$, $t_2 = fffaafaafaa$. Then $\sigma(t_1,t_2)$ is the term
$[f,f] [a,f] [\Box,f] [\Box,a] [\Box,a] [\Box,f] [\Box,a] [\Box,a] [f,f] [f,a] [a,\Box] [f,\Box] [a,\Box] [a,\Box] [a,a]$.
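The parallel superposition of Example 1 can be reproduced with a short recursive program. The sketch below is an illustration under assumptions: terms are given in prefix notation with single-character symbols, the padding constant is written "□" (the exact glyph is an assumption), and `None` stands for a missing subtree. The names `parse`, `sigma`, and `prefix` are hypothetical.

```python
PAD = "□"  # padding constant; the glyph used here is an assumption

def parse(word, arity, pos=0):
    """Parse prefix notation; returns ((symbol, children), next_pos)."""
    symbol = word[pos]
    pos += 1
    children = []
    for _ in range(arity[symbol]):
        child, pos = parse(word, arity, pos)
        children.append(child)
    return (symbol, children), pos

def sigma(t1, t2):
    """Superpose two trees; a missing tree (None) contributes PAD labels."""
    s1, c1 = t1 if t1 is not None else (PAD, [])
    s2, c2 = t2 if t2 is not None else (PAD, [])
    width = max(len(c1), len(c2))      # arity of the pair symbol [s1, s2]
    children = [
        sigma(c1[i] if i < len(c1) else None,
              c2[i] if i < len(c2) else None)
        for i in range(width)
    ]
    return ((s1, s2), children)

def prefix(tree):
    """Prefix notation of a superposed tree as a list of pair symbols."""
    (a, b), children = tree
    result = ["[" + a + "," + b + "]"]
    for child in children:
        result.extend(prefix(child))
    return result
```

Applied to the two terms of Example 1, this yields a word of 15 pair symbols whose last symbol is [a,a].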

In [...] it was shown that the set $T_{\mathcal{P}} = \{\sigma(t_1,t_2) \mid t_1, t_2 \in T(\mathcal{F}),\ t_1 =_{\mathcal{P}} t_2\}$ is
recognizable. Since $\mathcal{P}$ is a fixed ground TRS, $T_{\mathcal{P}}$ is also a fixed recognizable tree
language. Thus by Theorem 1 we can decide in $uNC^1$ whether a term $t \in T(\Delta)$
belongs to $T_{\mathcal{P}}$. Therefore in order to put the word problem for $A(\mathcal{F}, \mathcal{P})$ into
$uNC^1$ it suffices by Lemma 1 to prove the following lemma:

Lemma 4. The function $\sigma$ is FOM-definable.

Proof. We assume that the input for $\sigma$ is given as $t_1 t_2$. Let $x \in \{1, \ldots, |t_i|\}$. Then
the x-th symbol of $t_i$ corresponds to a node in the tree associated with $t_i$, and we
denote the sequence of numbers that labels the path from the root to this node
by $p_i(x)$, where each time we descend to the k-th child we write k. For instance
if $t_1 = faffafaaa$ then $p_1(7) = 2121$. This sequence can be also constructed
as follows. Let us fix some constant $a \in \mathcal{F}_0$. Let s be the prefix of $t_i$ of length
x - 1. Now we replace in s an arbitrary subword which belongs to $L \setminus \{a\}$ by a
and repeat this as long as possible. Formally we define a function $\Pi$ inductively
by $\Pi(s) = s$ if $s \notin \mathcal{F}^*(L \setminus \{a\})\mathcal{F}^*$ and $\Pi(vtw) = \Pi(vaw)$ if $t \in L \setminus \{a\}$. We have
for instance $\Pi(faff) = faff$ and $\Pi(fffaafaafa) = fafa$. Then it is easy
to see that $p_i(x) = k_1 \cdots k_m$ if and only if $\Pi(s) = f_1 a^{k_1 - 1} \cdots f_m a^{k_m - 1}$ where
$arity(f_j) > 0$ for $j \in \{1, \ldots, m\}$.
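The reduction just described and the way a path is read off from the reduced word can be illustrated as follows. This is a sketch under assumptions: symbols are single characters, the fixed constant is "a", and the input to `reduce_pi` is a prefix of a well-formed term. The names `reduce_pi` and `path` are hypothetical.

```python
def reduce_pi(word, arity, a="a"):
    """Replace subwords that are complete terms different from `a` by `a`,
    as long as possible (innermost-first, using a stack)."""
    stack = []
    for symbol in word:
        stack.append(a if arity[symbol] == 0 else symbol)
        while True:
            # index of the last symbol of positive arity on the stack
            k = max((i for i, s in enumerate(stack) if arity[s] > 0),
                    default=None)
            if k is None or len(stack) - k - 1 != arity[stack[k]]:
                break
            stack[k:] = [a]            # a complete subterm collapses to a
    return "".join(stack)

def path(term, arity, x, a="a"):
    """Child indices on the path from the root to the x-th symbol
    (1-based), read off from the reduced prefix f1 a^(k1-1) f2 a^(k2-1) ..."""
    reduced = reduce_pi(term[:x - 1], arity, a)
    result = []
    i = 0
    while i < len(reduced):
        i += 1                         # skip the function symbol f_j
        k = 1
        while i < len(reduced) and reduced[i] == a:
            k += 1
            i += 1
        result.append(k)
    return result
```

For the term faffafaaa and position 7 the prefix faffaf is already reduced, and reading it as f a | f | f a | f gives the path 2, 1, 2, 1 stated in the text.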

First we construct an FOM-formula $c(x_1, x_2)$ that evaluates to true for two
positions $x_1 \in \{1, \ldots, |t_1|\}$ and $x_2 \in \{1, \ldots, |t_2|\}$ if and only if $p_1(x_1) = p_2(x_2)$.
For this we formalize the ideas above in FOM. In the following formulas we use
the constants $o_1 = 0$ and $o_2$, where $o_2$ is uniquely defined by the formula $L(1, o_2)$.
Thus, if interpreted over the word $t_1 t_2$, we have $o_2 = |t_1|$ and $\max - o_2 = |t_2|$.
Furthermore let $I_1 = \{1, \ldots, o_2\}$ and $I_2 = \{1, \ldots, \max - o_2\}$. Quantification
over these intervals can be easily done in FOM. If $t_i = a_1 \cdots a_n$ and $1 \leq x \leq n$
then the formula $\varphi_i(\ell, r, x)$ evaluates to true if $r < x$, $a_\ell \cdots a_r \in L$, and the
interval between the positions $\ell$ and r is maximal with these two properties. The
formula $\pi_i(u, x)$ evaluates to true if $u = |\Pi(a_1 \cdots a_{x-1})|$ and finally $f_i(u, x)$
evaluates to true if the u-th symbol of $\Pi(a_1 \cdots a_{x-1})$ has a nonzero arity. Formally
for $i \in \{1, 2\}$ we define:

$\varphi_i(\ell, r, x) = r < x \wedge L(\ell + o_i, r + o_i) \wedge \neg\exists y, z \in I_i\, (y < \ell \wedge r \leq z < x \wedge L(y + o_i, z + o_i))$

$\pi_i(u, x) = (x = u + 1 + \#z\, (\exists \ell, r \in I_i\, (\varphi_i(\ell, r, x) \wedge \ell < z \leq r)))$

$f_i(u, x) = \exists z\, (z < x \wedge \neg\exists \ell, r \in I_i\, (\varphi_i(\ell, r, x) \wedge \ell \leq z \leq r) \wedge \pi_i(u, z + 1))$

$c(x_1, x_2) = \exists u\, (\pi_1(u, x_1) \wedge \pi_2(u, x_2) \wedge \forall y\, (1 \leq y \leq u \to (f_1(y, x_1) \leftrightarrow f_2(y, x_2))))$

Finally we can define the function $\sigma$ by the following formulas, where $\alpha, \beta \in \mathcal{F}$
(the formulas $\varphi_{[\Box,\alpha]}(x)$ and $\varphi_{[\alpha,\Box]}(x)$ can be defined similarly to $\varphi_{[\alpha,\beta]}(x)$):

$\varphi(x) = (\max = x + \#y_1 \in I_1\, (\exists y_2 \in I_2\, (c(y_1, y_2))))$

$\varphi_{[\alpha,\beta]}(x) = \exists y \in I_1\, \exists z \in I_2\, (c(y,z) \wedge Q_\alpha(y) \wedge Q_\beta(z + o_2) \wedge y + z = x + \#y'\, (y' \leq y \wedge \exists z' \leq z\, (c(y', z'))))$

□ 



Example 2. Let $t_1, t_2$ be from Example 1. In the following picture two positions
satisfy the formula $c(y, z)$ if they are connected by a line. If $x = 15$ then the






formula $\phi_{[a,a]}(x)$ is satisfied if we choose $y = 9$ and $z = 11$. Indeed, the 15-th
symbol of $\sigma(t_1, t_2)$ is $[a, a]$.

Corollary 1. For every finitely presented algebra the word problem is in uNC^1.

Clearly there are also finitely presented algebras whose word problems are
uNC^1-complete, like for instance the Boolean algebra $(\{0, 1\}, \wedge, \vee)$ |^. An interesting
open problem might be to find criteria for a finitely presented algebra A(T,V)
which imply that the word problem is uNC^1-complete. For similar work in the
context of finite groupoids see P].

We should also say a few words concerning the input representation. In
Theorem 1 and Corollary 1 we represent the input terms as strings over the alphabet
$\mathcal{F}$. This is in fact crucial for the uNC^1 upper bounds. If we represented
input terms by their pointer representations then the problems considered would
in general be L-complete. For instance if Boolean expressions are represented by
their pointer representations then the expression evaluation problem becomes
L-complete 0. For other problems on trees for which it is crucial whether the
string or the pointer representation is chosen see Eng. For the uniform
membership problems in the next section the encoding of the input terms is not
crucial for the complexity since these problems are at least L-hard regardless of
the chosen encoding.

5 Uniform Membership Problems 

In this section we will investigate uniform membership problems for TDTAs. 
First we need some preliminary results. 

Remark 1. The uniform membership problem for the class of all ε-free
context-free grammars is in LOGCFL.

This fact seems to be folklore. In fact the usual algorithm for recognizing a 
context-free language on a push-down automaton can be implemented on a log- 
space bounded auxiliary push-down automaton also if the context-free grammar 
is part of the input. Furthermore if the grammar does not contain ε-productions
then this automaton runs in polynomial time. Thus Remark 1 follows from m-
The next lemma is stated in a similar form in [241 Lemma 3].

Lemma 5. Let $G = (N, \Sigma, S, P)$ be a context-free grammar in Chomsky normal
form. Assume that $A \Rightarrow_G^* s$, where $A \in N$, $s \in (N \cup \Sigma)^*$, and $|s| > 2$. Then there
exist a factorization $s = u_1 v u_2$ and $B \in N$ such that $A \Rightarrow_G^* u_1 B u_2$, $B \Rightarrow_G^* v$
and $|v|, |u_1 B u_2| \le \frac{8}{9} \cdot |s|$.

Proof. Consider a derivation tree $T$ for the derivation $A \Rightarrow_G^* s$ and let $n = |s|$.
Since $G$ is in Chomsky normal form $T$ is a binary tree. For a node $\nu$ of $T$ let
yield($\nu$) be the factor of $s$ that labels the sequence of leaves from left to right of the
subtree of $T$ rooted at $\nu$. Consider a path $p$ in $T$ with the following two properties:

(i) $p$ starts at the root of $T$ and ends at a leaf of $T$. (ii) If an edge $(\nu, \nu_1)$ of $T$
belongs to $p$ and $\nu_1$ and $\nu_2$ are the children of $\nu$ then $|\mathrm{yield}(\nu_1)| \ge |\mathrm{yield}(\nu_2)|$.
Let $\nu$ be the first node on $p$ with $|\mathrm{yield}(\nu)| \le \frac{8}{9} \cdot n$ and let $\nu'$ be the parent node
of $\nu$. Thus $|\mathrm{yield}(\nu')| > \frac{8}{9} \cdot n$. Let $\nu$ be labeled with $B \in N$ and let yield($\nu$) = $v$.

Thus there exists a factorization $s = u_1 v u_2$ such that $A \Rightarrow_G^* u_1 B u_2$, $B \Rightarrow_G^* v$,
and $|v| \le \frac{8}{9} \cdot |s|$. Furthermore since $n > 2$ and $|v| \ge |\mathrm{yield}(\nu')|/2 > \frac{4}{9} \cdot n$ we also
have $|u_1 B u_2| = n - |v| + 1 < n - \frac{4}{9} \cdot n + 1 \le \frac{8}{9} \cdot n$. □
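
The choice of the node in this proof can be replayed on explicit derivation trees. The sketch below is only an illustration under assumptions: trees are given as a hypothetical `Node` class, the path always descends into a child with the largest yield, and the bound 8/9 is compared in integer arithmetic. The helper `comb` builds a right-comb derivation tree purely as test data.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                  # non-terminal, or a symbol of s at a leaf
    children: list = field(default_factory=list)

def yield_of(node):
    """Leaf labels of the subtree, read from left to right."""
    if not node.children:
        return [node.label]
    return [sym for child in node.children for sym in yield_of(child)]

def split(root):
    """Return (u1, B, v, u2) with s = u1 v u2, following the heavier child."""
    n = len(yield_of(root))
    if n <= 2:
        raise ValueError("the lemma assumes |s| > 2")
    node, u1, u2 = root, [], []
    while 9 * len(yield_of(node)) > 8 * n:      # yield still longer than (8/9) n
        kids = node.children
        k = max(range(len(kids)), key=lambda j: len(yield_of(kids[j])))
        u1 = u1 + [sym for c in kids[:k] for sym in yield_of(c)]
        u2 = [sym for c in kids[k + 1:] for sym in yield_of(c)] + u2
        node = kids[k]
    return u1, node.label, yield_of(node), u2

def comb(k):
    """Hypothetical test tree: right comb for S -> A S | a, A -> a, with yield a^k."""
    if k == 1:
        return Node("S", [Node("a")])
    return Node("S", [Node("A", [Node("a")]), comb(k - 1)])
```

For instance, `split(comb(9))` stops at the subtree with yield of length 8, so that both `v` and `u1 B u2` stay within the bound of the lemma.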

Theorem 2. The uniform membership problem for the class of all TDTAs is 
LOGCFL-complete under log-space reductions. 

Proof. By the second statement from Lemma El and Remark [I] the uniform 
membership problem for TDTAs is in LOGCFL. It remains to show LOGCFL- 
hardness. For this we will make use of a technique from Proof of The- 
orem 2]. Let G = {N, E, P, S) be an arbitrary fixed context-free grammar 0 
and let $w \in \Sigma^*$. We may assume that $G$ is in Chomsky normal form and that
$\varepsilon \notin L(G)$. Let $|w| = n$ and $\mathcal{F} = \{a, f\}$, where arity(a) = 0 and arity(f) = 2.
We will construct a TDTA $A = (Q, \mathcal{F}, q_0, \mathcal{R})$ and a term $t \in T(\mathcal{F})$ such that
$t \in T(A)$ if and only if $w \in L(G)$. Furthermore $A$ and $t$ can be computed in
log-space from $w$. Let



$$W = \{\, w_1 A_1 w_2 \cdots w_i A_i w_{i+1} \mid 0 \le i \le 3,\ A_1, \ldots, A_i \in N,\ w \in \Sigma^* w_1 \Sigma^* w_2 \cdots w_i \Sigma^* w_{i+1} \Sigma^* \,\}.$$

Thus $W$ is the set of all $s \in (N \cup \Sigma)^*$ with $|s|_N \le 3$ such that a subword of $w$
can be obtained by substituting terminal words for the non-terminals in $s$. Note
that $|W|$ is bounded polynomially in $|w| = n$, more precisely $|W| \in O(n^8)$. The
set $Q$ of states of $A$ is $Q = \{(A, s) \mid A \in N, s \in W\}$. The state $(A, s)$ may be
seen as the assertion that $A \Rightarrow_G^* s$ holds. The initial state $q_0$ is $(S, w)$. Finally
the set $\mathcal{R}$ contains all rules of the following form, where $A \in N$, $s \in W$, and
$v, u_1, u_2 \in (N \cup \Sigma)^*$ such that $u_1 v u_2 \in W$.

(1) $(A, s)(a) \to a$ if $(A, s) \in P$

(2) $(A, s)(f(x, y)) \to f((A, s)(x), (A, s)(y))$ if $(A, s) \in P$

(3) $(A, u_1 v u_2)(f(x, y)) \to f((A, u_1 B u_2)(x), (B, v)(y))$ if $|u_1 v u_2|_N < 3$ or
($|v|_N = 2$ and $|u_1 v u_2|_N = 3$).

Note that in (3), if $u_1 v u_2$ contains three non-terminals then we must choose a
factorization $u_1 v u_2$ such that $v$ contains exactly two non-terminals. Then also
$u_1 B u_2$ contains exactly two non-terminals. On the other hand if $u_1 v u_2$ contains
fewer than three non-terminals in (3) then we may choose any factorization. In



^ In fact we may choose for G Greibach’s hardest context-free grammar m 
this case both $v$ and $u_1 B u_2$ contain at most three non-terminals. This concludes
the description of $A$. Note that $A$ can be constructed in log-space from $w$. For
the definition of the term $t$ we need some further notations. In the following let
$\gamma = 9/8 > 1$ and for $m > 0$ let $g_m = 2 \cdot \lceil \log_\gamma(m) \rceil + 2$. Furthermore for $m > 0$
let bal($m$) $\in T(\{a, f\})$ be a fully balanced term of height $m$, i.e., if $m = 1$ then
bal($m$) = $a$, otherwise bal($m$) = $f$(bal($m - 1$), bal($m - 1$)). Now let $t$ = bal($g_n$).
Since $g_n$ is logarithmic in $n = |w|$, the size of $t$ is polynomially bounded in $n$ and
$t$ can be constructed from $w$ in log-space. We claim that $w \in L(G)$ if and only
if $t \in T(A)$. For the if-direction it suffices to prove the following more general
claim for all $A \in N$, $s \in W$, and $t' \in T(\{a, f\})$:

If $(A, s)(t') \to_A^* t'$ then $A \Rightarrow_G^* s$.

This statement can be shown easily by an induction on the structure of the term
$t'$. For the (only if)-direction we will first show the following two statements,
where $A \in N$, $s \in W$, $A \Rightarrow_G^* s$, and $|s| = m > 0$ (note that $\varepsilon \notin L(G)$).

(1) If $|s|_N < 3$ then $(A, s)(t') \to_A^* t'$ for some $t'$ with height($t'$) $\le g_m - 1$.

(2) If $|s|_N = 3$ then $(A, s)(t') \to_A^* t'$ for some $t'$ with height($t'$) $\le g_m$.

We prove these two statements simultaneously by an induction on the length of
the derivation $A \Rightarrow_G^* s$. If $m = 1$ then the derivation $A \Rightarrow_G^* s$ has length one.
Choose $t' = a$, then $(A, s)(t') \to_A t'$ and height($t'$) = 1 = $g_1 - 1$. If $m = 2$ then
since $G$ is in Chomsky normal form the derivation $A \Rightarrow_G^* s$ has length at most
3. Let $t' = f(f(a, a), a)$. Then $(A, s)(t') \to_A^* t'$ and height($t'$) = 3 < $g_2 - 1$. Now
assume $m \ge 3$. First let $|s|_N = 3$. Then there exist a factorization $s = u_1 v u_2$
and $B \in N$ with $A \Rightarrow_G^* u_1 B u_2$, $B \Rightarrow_G^* v$, and $|v|_N = |u_1 B u_2|_N = 2$. Let $m_1 =
|u_1 B u_2|$ and $m_2 = |v|$, thus $m = m_1 + m_2 - 1$. W.l.o.g. assume $m_1 \ge m_2$. Now
the induction hypothesis implies $(A, u_1 B u_2)(t_1) \to_A^* t_1$ and $(B, v)(t_2) \to_A^* t_2$ for
some terms $t_1$ and $t_2$ with height($t_1$) $\le g_{m_1} - 1$ and height($t_2$) $\le g_{m_2} - 1$. It follows
$(A, s)(f(t_1, t_2)) \to_A^* f(t_1, t_2)$ where height($f(t_1, t_2)$) = 1 + height($t_1$) $\le g_{m_1} \le
g_m$ since $m_1 = m + 1 - m_2 \le m$. Finally assume that $|s|_N < 3$. By Lemma 5 there
exist a factorization $s = u_1 v u_2$ and $B \in N$ such that $A \Rightarrow_G^* u_1 B u_2$, $B \Rightarrow_G^* v$,
and $v, u_1 B u_2 \in W$, and $|u_1 B u_2|, |v| \le m/\gamma$. Let $m_1 = |u_1 B u_2|$ and $m_2 = |v|$.
The induction hypothesis implies $(A, u_1 B u_2)(t_1) \to_A^* t_1$ and $(B, v)(t_2) \to_A^* t_2$
for some terms $t_1$ and $t_2$ such that height($t_1$) $\le g_{m_1} \le 2 \cdot \lceil \log_\gamma(m/\gamma) \rceil + 2 = g_m - 2$
and similarly height($t_2$) $\le g_m - 2$. It follows $(A, s)(f(t_1, t_2)) \to_A^* f(t_1, t_2)$ where
height($f(t_1, t_2)$) $\le g_m - 1$. This concludes the proof of the statements (1) and (2).
Now assume that $w \in L(G)$. Then $(S, w)(t') \to_A^* t'$ for some $t'$ with height($t'$) $\le
g_n$. But then the first two groups of rules of $A$ imply $(S, w)(\mathrm{bal}(g_n)) \to_A^* \mathrm{bal}(g_n)$,
i.e., $t = \mathrm{bal}(g_n) \in T(A)$. This concludes the proof of the theorem. □
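
The height bound $g_m$ and the balanced term used in this proof can be computed directly. The following is a small sketch, assuming terms are written as prefix strings over `f` and `a`; the function names are hypothetical, and the logarithm to base 9/8 is evaluated with exact integer arithmetic to avoid rounding issues.

```python
def ceil_log_gamma(m):
    """Smallest k with (9/8)^k >= m, i.e. the ceiling of log base 9/8 of m."""
    k = 0
    while 9 ** k < m * 8 ** k:
        k += 1
    return k

def g(m):
    """g_m = 2 * ceil(log_gamma(m)) + 2 with gamma = 9/8."""
    return 2 * ceil_log_gamma(m) + 2

def bal(m):
    """Fully balanced term of height m in prefix notation."""
    return "a" if m == 1 else "f" + bal(m - 1) + bal(m - 1)

def size_of_bal(m):
    """Number of symbols of bal(m), without building the term."""
    return 2 ** m - 1
```

Since `g(n)` grows logarithmically in `n`, `size_of_bal(g(n))` is bounded by a polynomial in `n`, although the degree is large; this is why the sketch offers `size_of_bal` instead of materializing `bal(g(n))` for bigger inputs. The step used in the induction, that a length of at most `m/γ` lowers the bound by two, corresponds to `g(m1) <= g(m) - 2` whenever `9 * m1 <= 8 * m`.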

Remark 1, Theorem 2, and the second statement from Lemma El immediately
imply the following corollary.

Corollary 2 . The uniform membership problem for the class of all parenthesis 
grammars is LOGCFL-complete under log-space reductions. 

In [15] it was shown that the problem of evaluating acyclic Boolean conjunctive
queries is LOGCFL-complete. In order to show LOGCFL-hardness, in [15]
Venkateswaran's characterization m of LOGCFL in terms of semi-unbounded
circuits is used. In fact the method from [15] may be modified in order to prove
Theorem 2. On the other hand our proof does not use Venkateswaran's result
and seems to be more elementary.

It should also be noted that since directed reachability in graphs is
NL-complete izni, the uniform membership problem for usual nondeterministic word
automata is NL-complete. Thus, since NL is supposed to be a proper subset of
LOGCFL, for the nondeterministic case the complexity seems to increase when
going from words to trees. The next theorem shows that this is not the case for
the deterministic case if we restrict to TDTAs.

Theorem 3. The uniform membership problem for the class of all deterministic 
TDTAs is L-complete under DLOGTIME-reductions. 

Proof. Hardness follows from the fact that the uniform membership problem for 
deterministic word automata is L-complete under DLOGTIME-reductions, see 
e.g. [B| and the remark in m Theorem 15]. For the upper bound we will use 
an idea that appeared in a similar form in ini Section 4] in the context of tree 
walking automata with pebbles. Let $A = (Q, \mathcal{F}, q_0, \mathcal{R})$ be a deterministic TDTA
and let $t \in T(\mathcal{F})$. We will outline a high-level description of a deterministic log-
space Turing machine that decides whether $t \in T(A)$. We use a result of ca,
which roughly speaking says that in order to check whether a tree is accepted 
by a deterministic TDTA it suffices to check each path from the root to a leaf 
separately. 

We assume that the input word $t \in L$ is stored on the input tape starting at
position 1. In the following we will identify a position $i \in \{1, \ldots, |t|\}$ on the input
tape with the corresponding node of the tree $t$. For the term $t = ffaaffaaa$ for
instance, position 1 is the root of $t$ and 3, 4, 7, 8, and 9 are the leaves of $t$. In the
high-level description we will use the following variables:

- $h_i \in \{1, \ldots, |t|\}$ ($i \in \{1, 2\}$) is a position on the input tape. With $h_1$ we visit
all nodes of $t$. Each time $h_1$ visits a leaf of $t$, with $h_2$ we walk down the path
from the root 1 to $h_1$ and check whether it is accepted by $A$.

- $f_i \in \mathcal{F}$ ($i \in \{1, 2\}$) is the label of node $h_i$.

- $q \in Q$ is the state to which node $h_2$ evaluates under the automaton $A$.
All these variables only need logarithmic space. We use the following routines: 

- brother($h$) returns the position of the right brother of $h$, or undefined if $h$
does not have a right brother. This value can be calculated in log-space by
counting, using the characterization of L from HD, see the proof of Lemma 0

- $\delta(f, q, i)$, where $f \in \mathcal{F}$, $q \in Q$, and $i \in \{1, \ldots, \mathrm{arity}(f)\}$, returns the state $q'$
such that if $q(f(x_1, \ldots, x_n)) \to f(q_1(x_1), \ldots, q_n(x_n)) \in \mathcal{R}$ then $q' = q_i$.

For instance for the term t above we have brother(2) = 5. Finally we present the 
algorithm. It is clear that this algorithm runs in log-space. 

    for h1 := 1 to |t| do
      if arity(f1) = 0 then
        q := q0; h2 := 1;
        while h2 < h1 do
          f := f2; i := 1; h2 := h2 + 1;
          while brother(h2) is defined and brother(h2) ≤ h1 do
            i := i + 1; h2 := brother(h2)
          endwhile
          q := δ(f, q, i)
        endwhile
        if (q(f2) → f2) ∉ R then reject
    endfor
    accept

□ 
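
The path-by-path check can be sketched in executable form. The code below is an illustration only: it does not model the log-space restriction, the dictionary encoding of the rules and all function names are assumptions, and a missing rule is treated as rejection. Rules of the form q(f(x1, ..., xn)) → f(q1(x1), ..., qn(xn)) are stored as `rules[(q, f)] = (q1, ..., qn)`, and a rule q(a) → a as `rules[(q, a)] = ()`.

```python
def subterm_end(t, arity, h):
    """Last position (1-based) of the subterm of t that starts at position h."""
    need, pos = 1, h
    while True:
        need += arity[t[pos - 1]] - 1   # one subterm read, arity many still required
        if need == 0:
            return pos
        pos += 1

def brother(t, arity, h):
    """Position of the right brother of h, or None if there is none."""
    for p in range(h - 1, 0, -1):
        if subterm_end(t, arity, p) >= h:        # nearest enclosing node: the parent
            child, index = p + 1, 1
            while child != h:
                child = subterm_end(t, arity, child) + 1
                index += 1
            if index < arity[t[p - 1]]:
                return subterm_end(t, arity, h) + 1
            return None
    return None                                  # h is the root

def accepts(t, arity, rules, q0):
    """Check every root-to-leaf path of t separately."""
    for h1 in range(1, len(t) + 1):
        if arity[t[h1 - 1]] != 0:
            continue
        q, h2 = q0, 1
        while h2 < h1:
            f, i = t[h2 - 1], 1
            h2 += 1                              # first child of the current node
            while True:
                b = brother(t, arity, h2)
                if b is None or b > h1:
                    break
                i, h2 = i + 1, b                 # move right while still left of h1
            successors = rules.get((q, f))
            if successors is None:
                return False
            q = successors[i - 1]
        if rules.get((q, t[h2 - 1])) != ():
            return False
    return True

# Hypothetical automaton: accepts exactly the terms whose left children are all leaves.
ARITY = {"f": 2, "a": 0}
RULES = {("q0", "f"): ("q1", "q0"), ("q0", "a"): (), ("q1", "a"): ()}
```

With these definitions `brother("ffaaffaaa", ARITY, 2)` is 5, as in the example above, and `accepts("fafaa", ARITY, RULES, "q0")` holds while `accepts("ffaaa", ARITY, RULES, "q0")` does not.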

Finally we consider deterministic BUTAs. Note that the uniform membership
problem for nondeterministic BUTAs was implicitly considered in Theorem 2
since the uniform membership problems for nondeterministic BUTAs and
nondeterministic TDTAs, respectively, can be directly translated into each other.

Theorem 4. The uniform membership problem for the class of all deterministic 
BUTAs is in LOGDCFL. 

Proof. Let $A = (Q, \mathcal{F}, q_f, \mathcal{R})$ be a deterministic BUTA and let $t \in T(\mathcal{F})$. Let
$\# \notin \{0, 1\}$ be an additional symbol. By ES] it suffices to outline a deterministic
log-space bounded auxiliary push-down automaton $M$ that checks in polynomial
time whether $t \in T(A)$. The input word $t$ is scanned from right to left.
A sequence of the form $\#\mathrm{bin}(q_1)\#\mathrm{bin}(q_2) \cdots \#\mathrm{bin}(q_m)$ is stored on the
push-down, where bin($q_i$) is the binary coding of the state $q_i \in Q$ and the
top-most push-down symbol corresponds to the right-most symbol in this word.
The length of this coding is bounded logarithmically in the input length. If $M$
reads the symbol $f$ from the input, where arity($f$) = $n$, then $M$ replaces the
sequence $\#\mathrm{bin}(q_1)\#\mathrm{bin}(q_2) \cdots \#\mathrm{bin}(q_n)$ by the sequence $\#\mathrm{bin}(q)$ on top of the
push-down, where $f(q_1(x_1), \ldots, q_n(x_n)) \to q(f(x_1, \ldots, x_n))$ is a rule in $\mathcal{R}$. The
auxiliary tape is used for storing binary coded states. □
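
The right-to-left scan with a push-down can be sketched as follows. This is a simplified illustration under assumptions: states are kept on a Python list instead of as binary codes, the rule table `rules[(f, (q1, ..., qn))] = q` is a hypothetical encoding, and the input is assumed to be a well-formed term in prefix notation. In this sketch the state of the first child is the one popped first.

```python
def run_buta(t, arity, rules, final_state):
    """Evaluate the prefix-notation term t bottom-up, scanning from right to left."""
    stack = []
    for sym in reversed(t):
        n = arity[sym]
        # The children were read before their parent; the leftmost child is on top.
        args = tuple(stack.pop() for _ in range(n))
        q = rules.get((sym, args))
        if q is None:
            return False                 # no applicable rule
        stack.append(q)
    return len(stack) == 1 and stack[0] == final_state

# Hypothetical example: Boolean expressions with A = and, O = or, I = implication.
ARITY = {"A": 2, "O": 2, "I": 2, "0": 0, "1": 0}
RULES = {("0", ()): 0, ("1", ()): 1}
for x in (0, 1):
    for y in (0, 1):
        RULES[("A", (x, y))] = x & y
        RULES[("O", (x, y))] = x | y
        RULES[("I", (x, y))] = (1 - x) | y
```

Here the automaton is the two-element Boolean algebra extended by implication, so `run_buta("A1O01", ARITY, RULES, 1)` evaluates the expression 1 ∧ (0 ∨ 1); the non-commutative symbol `I` is included only to make the argument order visible.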

The precise complexity of the uniform membership problem for deterministic 
BUTAs remains open. For the lower bound we can only prove L-hardness. This 
problem has also an interesting reformulation in terms of finite algebras. A de- 
terministic BUTA A corresponds in a straight-forward way to a finite algebra 
A. The carrier set of A is the set Q of states of A and every function symbol / 
of arity n is interpreted as an n-ary function on Q. Now the question whether a 
term t is accepted by A is equivalent to the question whether the expression t 
evaluates in the algebra A to a distinguished element q (namely the final state of 
A). Thus the uniform membership problem for deterministic BUTAs is equiva- 
lent to the uniform expression evaluation problem for finite algebras. In the case 
of a fixed groupoid, the complexity of the expression evaluation problem was 
considered in El- 

Table 1 summarizes the complexity results for tree automata shown in this
paper.

Table 1. Complexity results for tree automata

                        det. TDTA         det. BUTA         TDTA (BUTA)
    membership          uNC^1-complete    uNC^1-complete    uNC^1-complete
    uniform membership  L-complete        LOGDCFL           LOGCFL-complete

Abstract. We provide some new results concerning the use of transfinite
rewriting for giving semantics to rewrite systems. We especially
(but not only) consider the computation of possibly infinite constructor 
terms by transfinite rewriting due to their interest in many programming 
languages. We reconsider the problem of compressing transfinite rewrite 
sequences into shorter (possibly finite) ones. We also investigate the role 
that (finitary) confluence plays in transfinite rewriting. We consider dif- 
ferent (quite standard) rewriting semantics (mappings from input terms 
to sets of reducts obtained by -transfinite- rewriting) in a unified frame- 
work and investigate their algebraic structure. Such a framework is used 
to formulate, connect, and approximate different properties of TRSs. 



1 Introduction 

Rewriting that considers infinite terms and reduction sequences of any ordinal
length is called transfinite rewriting; rewriting sequences of length $\omega$ are often
called infinitary. The motivation to distinguish between them is clear: reduction
sequences of length at most $\omega$ seem more adequate for real applications (but
transfinite rewriting is suitable for modeling rewriting with finite cyclic graphs).
There are two main frameworks for transfinite rewriting: [DKP91] considers
standard Cauchy convergent rewriting sequences; [KKSV95] only admits strongly
convergent sequences, which are Cauchy convergent sequences in which redexes
are contracted at deeper and deeper positions. Cauchy convergent sequences are
more powerful than strongly convergent ones w.r.t. their computational strength,
i.e., the ability to compute canonical forms of terms (normal forms, values, etc.).

Example 1. Consider the TRS (see [KKSV95]):

    $f(x, g) \to f(c(x), g)$        $g \to a$
    $f(x, a) \to c(x)$              $h \to c(h)$

and the derivation of length $\omega + 2$:

    $f(a, g) \to f(c(a), g) \to \cdots\ f(c^\omega, g) \to f(c^\omega, a) \to c^\omega$

No strongly convergent reduction rewrites $f(a, g)$ into the infinite term $c^\omega$.

* This work has been partially supported by CICYT TIC 98-0445-C03-01. 

Transfinite, strongly convergent sequences can be compressed into infinitary ones
when dealing with left-linear TRSs [KKSV95]. This is not true for Cauchy
convergent sequences: e.g., there is no Cauchy convergent infinitary sequence
reducing $f(a, g)$ into $c^\omega$ in Example 1. Because of this, Kennaway et al. argue that
strongly convergent transfinite rewriting is the best basis for a theory of
transfinite rewriting [KKSV95]. However, Example 1 shows that the restriction to
strongly convergent sequences may lose computational power, since many
possibly useful sequences are just disallowed. Thus, from the semantic point of view,
it is interesting to compare them further.

In this paper, we especially consider the computation of possibly infinite con- 
structor terms by (both forms of) transfinite rewriting. In algebraic and func- 
tional languages, constructor terms play the role of completely meaningful pieces 
of information that (defined) functions take as their input and produce as the 
outcome. We prove that every infinitary rewrite sequence leading to a construc- 
tor term is strongly convergent. We prove that, for left-linear TRSs, transfinite 
rewrite sequences leading to finite terms can always be compressed into finite 
rewrite sequences and that infinite terms obtained by transfinite rewrite se- 
quences can always be finitely (but arbitrarily) approximated by finitary rewrite 
sequences. We have investigated the role of finitary confluence in transfinite 
rewriting. We prove that for left-linear, (finitary) confluent TRSs, Cauchy con- 
vergent transfinite rewritings leading to infinite constructor terms can always be 
compressed into infinitary ones. We also prove that finitary confluence ensures 
the uniqueness of infinite constructor terms obtained by infinitary rewriting. We 
use our results to define and compare rewriting semantics. By a rewriting se- 
mantics we mean a mapping from input terms to sets of reducts obtained by 
finite, infinitary, or transfinite rewriting. We study different rewriting semantics 
and their appropriateness for computing different kinds of interesting semantic 
values in different classes of TRSs. We also investigate two orderings between 
semantics that provide an algebraic framework for approximating semantic prop- 
erties of TRSs. We motivate this framework using some well-known problems in 
term rewriting. 

Section 2 introduces transfinite rewriting. Section 3 deals with compression.
Section 4 investigates the role of finitary confluence in transfinite rewriting
computations. Section 5 introduces the semantic framework and Section 6 discusses
its use in approximating properties of TRSs. Section 7 compares different
semantics studied in the paper. Section 8 discusses related work.

2 Transfinite Rewriting 

Terms are viewed as labelled trees in the usual way. An (infinite) term on a
signature $\Sigma$ is a finite (or infinite) ordered tree such that each node is labeled by a
symbol $f \in \Sigma$ and has a tuple of descendants, and the size of such a tuple is equal
to $ar(f)$ (see for a formal definition). When considering a denumerable
set of variables $X$, we obtain terms with variables in the obvious way. The set
of (ground) infinite terms is denoted by $T^\omega(\Sigma, X)$ (resp. $T^\omega(\Sigma)$) and $T(\Sigma, X)$
(resp. $T(\Sigma)$) is the set of (resp. ground) finite terms. Notations $T^\infty(\Sigma, X)$ or
$T^\infty(\Sigma)$ are more frequent; by using $\omega$, we emphasize that transfinite terms are
not considered. This is more consistent with the use of $\omega$ and $\infty$ in the paper.

By $Pos(t)$, we denote the set of positions of a term $t$, and $|p|$ is the length of
a position $p$. By $\Lambda$, we denote the empty chain. The height $d_t$ of $t \in T(\Sigma, X)$ is
given by $d_t = 1 + \max(\{|p| \mid p \in Pos(t)\})$. We use the following distance $d$ among
terms: $d(t, s) = 0$ if $t = s$; otherwise, $d(t, s) = 2^{-p(t,s)}$ (where $p(t, s)$ is the length
$|p|$ of the shortest position $p \in Pos(t) \cap Pos(s)$ such that $root(t|_p) \ne root(s|_p)$
[K JoiiSfij ). Therefore, $(T^\omega(\Sigma, X), d)$ is a metric space. Note that, if $t \in T(\Sigma)$
and $\varepsilon \le 2^{-d_t}$, then $\forall s \in T^\omega(\Sigma)$, $d(t, s) < \varepsilon \Leftrightarrow t = s$. A substitution is a
mapping $\sigma : X \to T^\omega(\Sigma, X)$ which we homomorphically extend to a mapping
$\sigma : T^\omega(\Sigma, X) \to T^\omega(\Sigma, X)$ by requiring that it be continuous w.r.t. the metric
$d$. A rewrite rule is an ordered pair $(l, r)$, written $l \to r$, with $l, r \in T(\Sigma, X)$,
$l \notin X$ and $Var(r) \subseteq Var(l)$. The left-hand side (lhs) of the rule is $l$ and $r$
is the right-hand side¹ (rhs). A TRS is a pair $\mathcal{R} = (\Sigma, R)$ where $R$ is a set
of rewrite rules. Given $\mathcal{R} = (\Sigma, R)$, we consider $\Sigma$ as the disjoint union $\Sigma =
C \uplus F$ of symbols $c \in C$, called constructors, and symbols $f \in F$, called defined
functions, where $F = \{f \mid f(\bar{l}) \to r \in R\}$ and $C = \Sigma - F$. Then, $T(C, X)$ (resp.
$T(C)$, $T^\omega(C)$) is the set of (resp. ground, possibly infinite ground) constructor
terms. A term $t \in T^\omega(\Sigma, X)$ is a redex if there exist a substitution $\sigma$ and a rule
$l \to r$ such that $t = \sigma(l)$. A term $t \in T^\omega(\Sigma, X)$ rewrites to $s$ (at position $p$),
written $t \to_{[p,\rho]} s$ (or just $t \to s$), if $t|_p = \sigma(l)$ and $s = t[\sigma(r)]_p$, for some rule
$\rho : l \to r \in R$, $p \in Pos(t)$ and substitution $\sigma$. This can eventually be detailed by
writing $t \to_{[p,\rho,\sigma]} s$ (substitution $\sigma$ is uniquely determined by $t|_p$ and $l$). A normal
form is a term without redexes. By $NF_\mathcal{R}$ ($NF^\omega_\mathcal{R}$) we denote the set of finite (resp.
possibly infinite) ground normal forms of $\mathcal{R}$. A term $t$ is root-stable (or a head
normal form) if $\forall s$, if $t \to^* s$, then $s$ is not a redex. By $HNF_\mathcal{R}$ ($HNF^\omega_\mathcal{R}$) we denote
the set of ground root-stable finite (resp. possibly infinite) terms of $\mathcal{R}$. In the
following, we are mainly concerned with ground terms.
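
For finite terms the height and the distance defined above are easy to compute. The following is a minimal sketch, assuming finite ground terms encoded as nested tuples `(symbol, subterm, ...)`; this encoding and the function names are choices made for the illustration, and infinite terms are outside its scope.

```python
from fractions import Fraction

def height(t):
    """d_t = 1 + maximal length of a position of t."""
    return 1 + max((height(c) for c in t[1:]), default=0)

def diff_depth(t, s):
    """Length of a shortest position where the root symbols differ, or None if t = s."""
    if t[0] != s[0]:
        return 0
    depths = [diff_depth(a, b) for a, b in zip(t[1:], s[1:])]
    depths = [k for k in depths if k is not None]
    return 1 + min(depths) if depths else None

def dist(t, s):
    """d(t, s) = 0 if t = s, and 2^(-p(t, s)) otherwise."""
    k = diff_depth(t, s)
    return Fraction(0) if k is None else Fraction(1, 2 ** k)

# Terms from Example 1: f(a, g) and f(c(a), g) differ at depth 1.
a, g = ("a",), ("g",)
t1 = ("f", a, g)
t2 = ("f", ("c", a), g)
```

For these two terms `dist(t1, t2)` is 1/2. Exact fractions are used so that comparisons with bounds such as $2^{-d_t}$ do not depend on floating-point rounding.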

A transfinite rewrite sequence of a TRS $\mathcal{R}$ is a mapping $A$ whose domain
is an ordinal $\alpha$ such that $A$ maps each $\beta < \alpha$ to a reduction step $A_\beta \to A_{\beta+1}$
[KKSV95]. If $\alpha$ is a limit ordinal, $A$ is called open; otherwise, it is closed. If
$\alpha \in \{\omega, \omega + 1\}$, we will say that $A$ is infinitary, rather than transfinite. The
length of the sequence is $\alpha$ if $\alpha$ is a limit ordinal; otherwise, it is $\alpha - 1$. For
limit ordinals $\beta < \alpha$, the previous definition of transfinite rewrite sequence does
not stipulate any relationship between $A_\beta$ and the earlier terms in the sequence.
Thus, the following notion is helpful [KKSV95]: Given a distance $d$ on terms, a
rewriting sequence $A$ is said to be Cauchy continuous if for every limit ordinal
$\lambda < \alpha$, $\forall \varepsilon > 0, \exists \beta < \lambda, \forall \gamma\ (\beta < \gamma < \lambda \Rightarrow d(A_\gamma, A_\lambda) < \varepsilon)$. Given a reduction
sequence $A$, let $d_\beta$ be the length of the position of the redex reduced in the
step from $A_\beta$ to $A_{\beta+1}$. A Cauchy continuous closed sequence $A$ is called Cauchy
convergent. A Cauchy convergent sequence is called strongly continuous if for
every limit ordinal $\lambda < \alpha$, the sequence $(d_\beta)_{\beta < \lambda}$ tends to infinity. If $A$ is strongly
continuous and closed, then $A$ is strongly convergent. We write $A : t \hookrightarrow^{\alpha} s$
(resp. $A : t \to^{\alpha} s$) for a Cauchy (resp. strongly) convergent sequence of length $\alpha$
starting from $t$ and ending at $s$. We write $A : t \hookrightarrow^{\le \alpha} s$ (resp. $A : t \to^{\le \alpha} s$) for
a Cauchy (resp. strongly) convergent sequence starting from $t$ and ending at $s$
whose length is less than or equal to $\alpha$; moreover, we often take the initial term
$t$ of $A$ as its name and write $t_\beta$ for an ordinal $\beta < \alpha$ rather than $A_\beta$. We write
$t \hookrightarrow^{\infty} s$ (resp. $t \to^{\infty} s$) if we do not wish to explicitly indicate the length of the
sequence.

¹ In [KKSV95], infinite right-hand sides are also allowed.

3 Compression of Transfinite Rewrite Sequences 

The following result shows that, for computing constructor terms, Cauchy
convergent and strongly convergent infinitary rewriting are equivalent.

Theorem 1. Let $\mathcal{R}$ be a TRS, $t \in T^\omega(\Sigma)$ and $\delta \in T^\omega(C)$. Every Cauchy
convergent infinitary rewrite sequence from $t$ to $\delta$ is strongly convergent.

Proof. Let $t \hookrightarrow^{\le \omega} \delta$. Since no rule may overlap any constructor prefix of $\delta$, $d_\beta$ for
redexes contracted in each step $t_\beta \to t_{\beta+1}$ must tend to infinity as the normal
form does.

Example 1 shows that, in general, Theorem 1 does not hold for transfinite
sequences. Moreover, Theorem 1 does not hold for arbitrary normal forms.

Example 2. Consider the orthogonal (infinite) TRS [KKSV95]:

    $f(g^n(c)) \to f(g^{n+1}(c))$ for $n \ge 0$
    $a \to g(a)$

Thus, $f(c) \hookrightarrow^{\omega} f(g^\omega) \in NF^\omega_\mathcal{R}$, but $f(c) \not\to^{\infty} f(g^\omega)$.

Kennaway et al. proved the following: 

Theorem 2. [KKSV95] Let $\mathcal{R}$ be a left-linear TRS and $t, s \in T^\omega(\Sigma)$. If $t \to^{\infty} s$,
then $t \to^{\le \omega} s$.

Theorem 2 does not hold for Cauchy convergent reductions.

Remark 1. The compression of Cauchy convergent sequences into infinitary ones
has been studied² in [DKP91], where top-termination (i.e., no infinitary
reduction sequence performs infinitely many rewrites at $\Lambda$) is required to achieve it.
Kennaway et al. noticed that it implies that "every reduction starting from a
finite term is strongly convergent", thus arising as a consequence of Theorem 2.

For Cauchy convergent sequences, we prove several restricted compression prop- 
erties. 

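Proposition 1. Let $\mathcal{R}$ be a TRS, $t \in T^\omega(\Sigma)$ and $s$ be a finite term. If $t \hookrightarrow^{\lambda} s$
for a limit ordinal $\lambda$, then $t \hookrightarrow^{\beta} s$ for some $\beta < \lambda$.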



^ In the terminology of Ilikbldl . o;-closure. 
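
Proof. If $t \hookrightarrow^{\lambda} s$, then $\forall \varepsilon > 0$, $\exists \beta_\varepsilon$, $\forall \gamma$, $\beta_\varepsilon < \gamma < \lambda$, $d(t_\gamma, s) < \varepsilon$. Since $s$ is finite,
we let $\varepsilon = 2^{-d_s}$. Then, $\forall \gamma$, $\beta_\varepsilon < \gamma < \lambda$, it must be $t_\gamma = s$. In particular, $t \hookrightarrow^{\beta} s$
for $\beta = \beta_\varepsilon + 1 < \lambda$ (since $\lambda$ is a limit ordinal).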

In Theorem 3 below, we prove that, for left-linear TRSs, finite pieces of
information obtained from transfinite rewriting can always be obtained from finitary
sequences. In the following auxiliary result, we use $d_\rho = \max(d_l, d_r)$ for a rule
$\rho : l \to r$ (note that $d_\rho$ only makes sense for rules whose rhs's are finite).

Proposition 2. Let TZ be a left-linear TRS. Let t,s G T'^{S) be such that t = 

ti ^ g jQ^ p g T‘^{S) such 

that d{tf t) < lpd+d„)-K/og 2 («)^ ^^ig^g g r^[E) such that t' s' 

and d{s', s) < k. 

Proof. By induction on n. If n = 0, then t = s and d(t',s) < 
2 *off 2 (K) _ ^gi; g' — f' ^}jg conclusion follows. If n > 0, 

then, since 0 < k < 1 , we have d{t',t) < < 

2“l^’il”'^'i . Since TZ is left-linear, it follows that t'\p^ is a redex of pi. 
Thus, t' — >■ t'[a'i(ri)]p^ and for all x G Var(Zi), d(cr((a:), tri(a;)) < 

Hence, since Var(ri) C Var(Zi) and dp,^ = 
maa;(dq,d^J, we have d(t'[cr'i(n)]p,,t 2 ) < 2"(^"=2 IPd+dpJ-|pihd,.,+ios 2 («) 
< 2 “(^i =2 By the induction hypothesis, t'[a'i{ri)]p.^ — >■* s' and 

d{s' , s) < At; hence t' — >■* s' and the conclusion follows. 

Proposition 3. Let TZ be a left-linear TRS, t,s G T‘^(Af), and t' G — 

T{S). Lf t t' — >■* s for a limit ordinal X, then for all k G ]0, 1], there exist 
s' G T“(Af) and fi < X such that t s' and d{s' , s) < k. 

Proof. Since A is a limit ordinal and t' is infinite, for all e > 0, there exists 
/3g < A such that, for all 7 , /3g < 7 < A, d{tj,t') < e. Let t' = ti t 2 — >■ 

■ ■ ■ In = s, e = lpd+dpi)-i-/og 2 (K)^ _l_ X. Thus, 

d{tpg,t') < e and, by Proposition Q t/ 3 o s' and d{s',s) < k. Thus, t s' 
for P = Po + n + 1 < X and the conclusion follows. 

Theorem 3. Let $\mathcal{R}$ be a left-linear TRS and $t, s \in T^\omega(\Sigma)$. If $t \hookrightarrow^{\infty} s$, then for
all $\kappa \in\ ]0, 1]$, there exists $s' \in T^\omega(\Sigma)$ such that $t \to^* s'$ and $d(s', s) < \kappa$.

Proof. Let t s for an arbitrary ordinal a. We proceed by transfinite induc- 
tion. The finite case a < u (including the base case a = 0) is immediate since 
t — >■* s and d{s,s) = 0 < k, for all k G ]0,1]. Assume a > ui. We can write 
t t' s for some limit ordinal A < a. If t' is finite, by Proposition [Q 
3/3 < A such that t t' . Thus, we have t s where n is the length of the 

finite derivation t' — >■* s. Since A is a limit ordinal, P < X, and n G N, we have 
P -\- n < X and by the induction hypothesis, the conclusion follows. If t' is not 
finite, by Proposition 0 there exist s' G T"^{S) and P < X such that t s' 
and d{s', s) < k. Since P < a, hy the induction hypothesis, t — >■* s' and the 
conclusion follows. 

Corollary 1. Let $\mathcal{R}$ be a left-linear TRS, $t \in T^\omega(\Sigma)$, and $s$ be a finite term. If
$t \hookrightarrow^{\infty} s$, then $t \to^* s$.

The following example shows the need for left-linearity to ensure Corollary 1.

Example 3. Consider the TRS [DKP91]:

    $a \to g(a)$        $b \to g(b)$        $f(x, x) \to c$

Note that $f(a, b) \to^{\omega} f(g^\omega, g^\omega) \to c$, but $f(a, b) \not\to^{*} c$.



4 Finitary Confluence and Transfinite Rewriting



A TRS $\mathcal{R}$ is (finitary) confluent if $\to$ is confluent, i.e., for all $t, s, s' \in T(\Sigma, X)$,
if $t \to^* s$ and $t \to^* s'$, there exists $u \in T(\Sigma, X)$ such that $s \to^* u$ and $s' \to^* u$.
Finitary confluence of $\to$ for finite terms extends to infinite terms.



Proposition 4. Let $\mathcal{R}$ be a confluent TRS and $t, s, s' \in T^\omega(\Sigma, X)$. If $t \to^* s$
and $t \to^* s'$, then there exists $u \in T^\omega(\Sigma, X)$ such that $s \to^* u\ {}^*\!\leftarrow s'$.



Proof, (sketch) If t is finite, then it is obvious. If t is infinite, we consider the 









t2 — >• 



” 4 " 



, j j. j.f fPi’Pil 1 / 

^n +1 = s and t = t\ — >• T 2 



m+l 



= s'. Let P =\+ x;r=i b*l + dpi + E™ 1 ¥i\ + dp' and t 



derivations t 

^ t'r 

be the finite term obtained by replacing in t all subterms at positions qi, . . . ,qk 
such that |gi| = P, for 1 < f < fc, by new variables Xi,...,xi, i < k. Let 
: {1, . . . , A:} — >• {1, . . . ,f} be a surjective mapping that satisfies n{i) = v{j) 
t\pi = t\p. for 1 < i < j < fc. Thus, each variable x G {x\, . . . ,X£} will name 
equal subterms at possibly different positions. It is not difficult to show that 
3s, s' G T{S,X) such that t — >•* s, t — >•* s', and s = cr(s), s' = cr(s') for a 
defined by cr(a;,^(q) = for 1 < * < fc. By confluence, 3u G T{E,X) such that 



s — >■* u *G- s' and by stability of rewriting, s = ct ( s ) — >■* a{u) *G- cr(s') = s'. 
Thus, u = cr(u) is the desired (possibly infinite) term. 



Proposition 21 could fail for TRSs whose rhs’s are infinite (see [KKSVf),^ . Coun- 
terexample 6.2). Concerning the computation of infinite constructor terms, we 
prove that, for left-linear, confluent TRSs, infinitary sequences suffice. First we 
need an auxiliary result. 



Proposition 5. Let TZ be a left-linear, confluent TRS, t G T^{E), and 6 G 
T“(C)— T(C). Ift 5, then (1) ^s G NFtj such thatt — >■* s, (2) '^5' G T^{C), 
S S' such that t S' . 

Proof. (1) Assume that t — >•* s G NFtj and let n = Since t S, by 

TheoremEl there exists s' such that t — >■* s' and d{s',S) < n. Since s G NFt^, 
by confluence, s' — >■* s which is not possible as s' has a constructor prefix 
whose height is greater than the height of s. (2) Assume that t S' for some 
S' G T^{C), S yf S'. Let K = d{S,S')/2. By Theorem 0 3s, s' such that t — >■* s, 
t — >■* s', d{s, S) < K, and d{s', S') < k. By Proposition^ there exists u such that 
s — >■* u *G- s'. However, this is not possible since there exists p G Pos(s)nPos(s') 
such that root{s\p) = c yf c' = root{s'\p) and c, c' G C. 
Theorem 4. Let ℛ be a left-linear, confluent TRS, t ∈ T^ω(Σ), and δ ∈ T^ω(C).
If t →^∞ δ, then t →^{≤ω} δ.

Proof. If 6 G T(C), it follows by Corollary QJ Let S G T^{C) — T{C). Assume 
that t S. Then, by Proposition 02), there is no other infinite construc- 

tor term reachable from t. Thus, we can associate a number Mc{A) G N 
to every (open or closed) infinitary sequence A starting from t, as follows: 
Mc{A) = maa;({mc(s) | s G A}) where nic(s) is the minimum length of 
maximal sub-constructor positions of s (i.e., positions p G Vos{s) such that 
Vg < p,root{s\q) G C). Note that, since we assume that no infinitary derivation 
A converges to <5, Mc{A) is well defined. We also associate a number n{A) G N 
to each infinite derivation A to be n{A) = min{{i G N | mc{Ai) = Mc{A)}), 
i.e., from term on, terms of derivation A does not increase its me number. 

We claim that Mc(t) = {Mc{A) | A is an infinitary derivation starting from t} 
is bounded by some N G N. In order to prove this, we use a kind of ‘diagonal’ 
argument. If Mc(t) is not bounded, there would exist infinitely many classes A 
of infinitary derivations starting from t such that VA, A! G A, Mc{A) = Mc{A') 
(that we take as Mc(A) for the class A) that can be ordered by A < A' 
iff Me{A) < Me{A'). Thus, classes of infinite derivations starting from t can 
be enumerated by 0,1,... according to their Me (A). Without losing gener- 
ality, we can assume Mc(Aq) > mc{t). Consider an arbitrary A G Ai for 
some f > 0. By confluence and Proposition Hi), there exist a least j > i 
and a derivation A' G Aj such that the first n(A) steps of A and A' coin- 
cide. We say that A continues into A' . Obviously, since Mc{Ai) < Mc{Aj), 
and the first n(A) steps of A and A' coincide, we have n(A) < n(A'). Once 
A^ G Ao has been fixed, by induction, we can define an infinite sequence 
A : A°, A^, . . . , A", . . . of infinitary derivations such that, for all n G N, A” 
continues into A"+^. For such a sequence A, we let i : N — N be as follows: 
t(n) = min{{m \ n < n(A'")}). We define an infinite derivation A starting from 
t as follows: V/3 < to, — >■ A‘j^^\. It is not difficult to see that A is well defined 

and, by construction, Mc{A) is not finite thus contradicting that t 6. Thus, 

Mc(i) = {Mc{A) I A is an infinitary derivation starting from t} is bounded by 
some N G N. Let n = By Theorem El there exists s such that t — >■* s and 

d(s,S) < K. Since PropositionEI(l) ensures that every finite sequence t — s can 
be extended into an infinitary one, it follows that A : t — >■* s • • • satisfies 
Mc{A) > N, thus contradicting that N is an upper bound of Mc(t). 

Since TZ in Example [D is not confluent, it shows the need for confluence in order 
to ensure Theorem^. Theorem0does not generalize to arbitrary normal forms. 

Example 4- Consider the following left-linear, confluent TRS: 

h(f(a(x),y)) → h(f(x,b(y)))    f(a(x),b(y)) → g(f(x,y))
h(f(x,b(y))) → h(f(a(x),y))    g(f(x,y)) → f(a(x),b(y))

The transfinite rewrite sequence 

h(f(a^ω,y)) →^ω h(f(a^ω,b^ω)) →^ω h(g^ω)

cannot be compressed into an infinitary one. 
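
The first ω-segment of this sequence can be approximated on finite terms. The following is only a minimal sketch, assuming that the infinite term a^ω is replaced by the finite prefix a^n(z) for a fresh variable z; the tuple encoding of terms and the helper names (match, instantiate, rewrite_at_root, power, shift_all) are illustrative and not part of the paper. It applies the first rule of Example 4 at the root until it no longer matches, so every a removed from the first argument of f becomes a b in the second one.

```python
# Terms: a variable is a plain string; an application is a tuple (symbol, arg1, ...).

def match(pattern, term, subst):
    """Return subst extended so that the instantiated pattern equals term, else None."""
    if isinstance(pattern, str):                       # pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(term, subst):
    """Replace the variables of term by their images under subst."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(arg, subst) for arg in term[1:])

def rewrite_at_root(term, rule):
    """Apply rule (lhs, rhs) at the root of term, or return None if it does not match."""
    lhs, rhs = rule
    subst = match(lhs, term, {})
    return None if subst is None else instantiate(rhs, subst)

def power(symbol, n, base):
    """Build symbol^n(base), e.g. power('a', 2, 'z') is a(a(z))."""
    for _ in range(n):
        base = (symbol, base)
    return base

# First rule of Example 4: h(f(a(x),y)) -> h(f(x,b(y))).
RULE_1 = (("h", ("f", ("a", "x"), "y")), ("h", ("f", "x", ("b", "y"))))

def shift_all(n):
    """Rewrite h(f(a^n(z), y)) with RULE_1 until it no longer applies."""
    term, steps = ("h", ("f", power("a", n, "z"), "y")), 0
    while (nxt := rewrite_at_root(term, RULE_1)) is not None:
        term, steps = nxt, steps + 1
    return term, steps

final_term, step_count = shift_all(3)
print(step_count, final_term)
```

Every step of this prefix takes place at the root, which is why the corresponding infinite sequence is Cauchy convergent without being strongly convergent.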
With regard to Remark ^ we note that the conditions of Theorem 0 do not
forbid ‘de facto’ non-strongly convergent transfinite sequences (as in inKESl). 

Example 5. Consider TRS ℛ of Example 2 plus the rule h(x) → b(h(x)), and
the Cauchy convergent reduction sequence of length ω·2:

h(f(c)) → h(f(g(c))) → h(f(g(g(c)))) →^ω h(f(g^ω)) →^ω b^ω

which is not strongly convergent. Theorem^ensures the existence of an (obvious) 
Cauchy convergent infinitary sequence leading to b‘^ (which, by Theorem ^ is 
strongly convergent). However, results in fDKJr*91| do not apply here since the 
TRS is not top-terminating. 

A TRS ℛ = (Σ, R) is left-bounded if the set {d_l | l → r ∈ R} is bounded.

Theorem 5. [KKS V 9'^ Let TZ be a left-bounded, orthogonal TRS, t G T“(A) 
and s G NF^. If t s, then t — s. 

By Theorem 121 we conclude that if t s G NF^, then t s. Thus, for 
TRSs with finite rhs’s. Theorem El along with Theorem Q improves Theorem 0 
with regard to normal forms of a special kind: constructor terms. In particular, 
note that Theorem 0 along with Theorem | 2 | does not apply to Example 0 

The uniqueness of infinite normal forms obtained by transfinite rewriting is a 
consequence of the confluence of transfinite rewriting. Even though orthogonality 
does not imply confluence of transfinite rewriting, Kennaway et al. proved that 
it implies uniqueness of (possibly infinite) normal forms obtained by strongly 
convergent reductions. We prove that finitary confluence implies the uniqueness 
of constructor normal forms obtained by infinitary rewriting. 

Theorem 6. Let ℛ be a confluent TRS, t ∈ T^ω(Σ), and δ,δ' ∈ T^ω(C). If
t →^{≤ω} δ and t →^{≤ω} δ', then δ = δ'.

Proof. If t →* δ and t →* δ', it follows by Proposition 4. Assume that t →^ω δ
and δ ≠ δ'. In this case, δ is necessarily infinite. Let ε = d(δ,δ') and p ∈ Pos(δ)
be such that |p| = −log_2(ε). Since t →^ω δ, there exists n_ε ∈ N such that, ∀m,
n_ε < m < ω, d(t_m,δ) < ε. Let m = n_ε + 1; thus, t →* t_m. If t →* δ', then,
by Proposition 4, t_m →* δ'. However, since root(t_m|_p) = root(δ|_p) ∈ C, and
root(δ|_p) ≠ root(δ'|_p), rewriting steps issued from t_m cannot rewrite t_m into δ',
thus leading to a contradiction. On the other hand, if t = t'_1 →^ω δ', we consider a
term t'_{m'} such that d(t'_{m'},δ') < ε whose existence is ensured by the convergence
of t = t'_1 into δ'. The confluence of t'_{m'} and t_m into a common reduct s is also
impossible for similar reasons.

Theorem 0 does not hold for arbitrary normal forms. 

Example 6. Consider the (ground) TRS ℛ:

f(a) → a    f(a) → f(f(a))

Since t →* a for every ground term t, ℛ is ground confluent and hence confluent
(see [BN98]). We have f(a) → f(f(a)) → f(f(f(a))) →^ω f^ω ∈ NF^ω, but
also f(a) → a ∈ NF^ω.
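
The finitary side of this example can be explored mechanically. The following is a rough sketch, assuming ground terms encoded as nested tuples; the helper names (successors, reducts, height) are illustrative only. It enumerates the reducts of f(a) within a bounded number of steps: the only finite normal form found is a, while the terms f^n(a) keep growing, which is the finite shadow of the infinitary normal form f^ω.

```python
A = ("a",)

def f(t):
    """Build the ground term f(t)."""
    return ("f", t)

# The two ground rules of Example 6: f(a) -> a and f(a) -> f(f(a)).
RULES = [(f(A), A), (f(A), f(f(A)))]

def successors(term, rules):
    """All terms reachable from a ground term in exactly one rewrite step."""
    out = {rhs for lhs, rhs in rules if term == lhs}
    for i, arg in enumerate(term[1:], start=1):
        for reduct in successors(arg, rules):
            out.add(term[:i] + (reduct,) + term[i + 1:])
    return out

def reducts(term, rules, max_steps):
    """All terms reachable from term in at most max_steps rewrite steps."""
    seen, frontier = {term}, {term}
    for _ in range(max_steps):
        frontier = {s for t in frontier for s in successors(t, rules)} - seen
        seen |= frontier
    return seen

def height(term):
    """Height of a ground term; constants have height 1."""
    return 1 + max((height(arg) for arg in term[1:]), default=0)

found = reducts(f(A), RULES, 4)
normal_forms = {t for t in found if not successors(t, RULES)}
tallest = max(height(t) for t in found)
print(normal_forms, tallest)
```

Raising the step bound never adds a second finite normal form; it only produces taller terms of the shape f^n(a).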

For left-linear TRSs, Theorem 6 generalizes to transfinite rewriting.






S. Lucas 



Theorem 7. Let ℛ be a left-linear, confluent TRS, t ∈ T^ω(Σ), and δ,δ' ∈
T^ω(C). If t →^∞ δ and t →^∞ δ', then δ = δ'.

5 Term and Rewriting Semantics 

Giving semantics to a programming language (e.g., TRSs) is the first step for
discussing the properties of programs. Relating semantics (of the same or different
TRSs) is also essential in order to effectively analyze properties (via approximation).
We provide a notion of semantics which can be used with term
rewriting, and investigate two different orderings between semantics aimed at
approximating semantics and (hence) properties of programs. Our notion of
semantics is aimed at couching both operational and denotational aspects in the
style of computational, algebraic, or evaluation semantics [Bou85,Cou90,Pit97].
Since many definitions and relationships among our semantics do not depend on
any computational mechanism, we consider rewriting issues later (Section 5.1).

Definition 1. A (ground) term semantics for a signature Σ is a mapping
S : T(Σ) → P(T^ω(Σ)).

A trivial example of term semantics is empty given by ∀t ∈ T(Σ), empty(t) = ∅.
We say that a semantics S is deterministic (resp. defined) if ∀t ∈ T(Σ), |S(t)| ≤ 1
(resp. |S(t)| ≥ 1). Partial order ⊑ among semantics is the pointwise extension of
partial order ⊆ on sets of terms: S ⊑ S' if and only if ∀t ∈ T(Σ), S(t) ⊆ S'(t).
Another partial order ⪯ is defined: S ⪯ S' if there exists T ⊆ T^ω(Σ) such that,
for all t ∈ T(Σ), S(t) = S'(t) ∩ T. Given semantics S and S' such that S ⪯ S', any
set T ⊆ T^ω(Σ) satisfying S(t) = S'(t) ∩ T is called a window set of S' w.r.t. S.
Clearly, S ⪯ S' implies S ⊑ S' (but not vice versa). Note that empty ⪯ S for all
semantics S. The range W_S = ⋃_{t ∈ T(Σ)} S(t) of a semantics S suffices to compare
it with all the others.

Theorem 8. Let S,S' be semantics for a signature Σ. Then, S ⪯ S' if and only
if ∀t ∈ T(Σ), S(t) = S'(t) ∩ W_S.
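
On finite data, both orders can be checked directly. The following is a small sketch under simplifying assumptions: a semantics is modelled as a Python dict from input terms (plain strings here) to finite sets of outputs, and the names window, below_pointwise, below_window, as well as the toy tables red, nf, partial, and empty, are hypothetical and chosen only for illustration. The test for ⪯ uses the characterization of Theorem 8 rather than a search for a window set.

```python
def window(S):
    """Range W_S of a semantics: the union of all outputs S(t)."""
    return set().union(*S.values())

def below_pointwise(S, S2):
    """S ⊑ S2: S(t) is a subset of S2(t) for every input t."""
    return all(S[t] <= S2[t] for t in S)

def below_window(S, S2):
    """S ⪯ S2, tested via Theorem 8: S(t) = S2(t) ∩ W_S for every input t."""
    w = window(S)
    return all(S[t] == S2[t] & w for t in S)

# Toy tables over three inputs (illustrative data, not taken from the paper).
red = {"t1": {"t1", "t2", "v"}, "t2": {"t2", "v"}, "v": {"v"}}
nf = {"t1": {"v"}, "t2": {"v"}, "v": {"v"}}
partial = {"t1": {"v"}, "t2": set(), "v": {"v"}}
empty = {t: set() for t in red}

print(below_window(nf, red))         # nf is red restricted to the window {v}
print(below_pointwise(partial, red)) # partial is pointwise below red
print(below_window(partial, red))    # but it misses the value v for input t2
```

The table partial illustrates why ⊑ does not imply ⪯: it is pointwise below red, yet it fails to return a value from its own range that red makes available for t2.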



Proposition 6. Let S,S' be semantics for a signature Σ. (1) If S ⊑ S', then
W_S ⊆ W_S'. (2) If S ⪯ S', then W_S ⊆ W_S'.



Remark 2. As W_S is the union of possible outputs of semantics S, it collects the
'canonical values' we are interested in. Hence, it provides a kind of computational
reference. S' is 'more powerful' than S whenever S ⊑ S'. However, if S ⊑ S' but
S ⋠ S', there will be input terms t for which S is not able to compute interesting
values (according to W_S!) which, in turn, will be available by using S'. Thus,
S ⪯ S' ensures that (w.r.t. semantic values in W_S) we do not need to use S' to
compute them.

The following results that further connect ⪯ and ⊑ will be used later.



Transfinite Rewriting Semantics for Term Rewriting Systems



Proposition 7. Let Si, S^, 52,82 be semantics for a signature S. //Si ^ S 2 , 
Wsj C Ws2, Si ^ S'l, and S2 ^ S2, then S'l S2. 

Proposition 8. Let S, S', S'' be semantics for a signature Σ. If S ⪯ S' and
S ⊑ S'' ⊑ S', then S ⪯ S''.

5.1 Rewriting Semantics 

Definition 1 does not indicate how to associate a set of terms S(t) to each term
t. Rewriting can be used for setting up such an association.

Definition 2. A (ground) rewriting semantics for a TRS TZ = (if, R) is a term 
semantics S for E such that for all t G T{S) and s G S(t), t s 

A finitary rewriting semantics S only considers reducts reached by means of a
finite number of rewriting steps, i.e., ∀t ∈ T(Σ), s ∈ S(t) ⇒ t →* s. Given
a TRS ℛ, semantics red, hnf, nf, and eval (where ∀t ∈ T(Σ), red(t) = {s |
t →* s}, hnf(t) = red(t) ∩ HNF_ℛ, nf(t) = hnf(t) ∩ NF_ℛ, and eval(t) = nf(t) ∩
T(C)) are the most interesting finitary rewriting semantics involving reductions
to arbitrary (finitely reachable) reducts, head-normal forms, normal forms and
values, respectively. In general, if no confusion arises, we will not make the
underlying TRS ℛ explicit in the notations for these semantics (by writing red_ℛ,
hnf_ℛ, nf_ℛ, ...). Concerning non-finitary semantics, the corresponding (Cauchy
convergent) infinitary counterparts are: w— red, tu— hnf, w— nf, and w— eval using 
HNF)^, NF^, and T“(C). For a given TRS TZ, we have eval ^ nf ^ 
hnf ^ red and w— eval ^ w— nf ^ w— hnf ^ w— red with Wred = {s S T~{E) \ 
3t e T{E),t s}, FFhnf = HNFt^, Wnf = NFt^, and Wevai = T(C). Also, 
R/;-red = {s e T“(r) I 3t e T(E),t sj, M 4 ;_hnf = VF<.-red H HNF^, 

IFw-nf = W/j-red H NF^, and IF/j-evai = R/j-red H {C) . We also consider the 
strongly convergent versions w— Sred, w— Shnf, w— Snf, and w— Seval using 
As a simple consequence of Theorem El we have: 

Theorem 9. For all TRS, ω-eval = ω-Seval.
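
The finitary semantics red, nf, and eval introduced above can be computed by exhaustive search whenever a term has finitely many reducts. The following is a minimal sketch under that assumption (the search does not terminate otherwise); hnf is left out because root-stability is undecidable in general. The term encoding, the function names, the rules used as test data (borrowed from Example 8), and the choice of {b, g} as constructor symbols are assumptions made for illustration.

```python
# Terms: a variable is a plain string; an application is a tuple (symbol, arg1, ...).

def match(pattern, term, subst):
    """Return subst extended so that the instantiated pattern equals term, else None."""
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(term, subst):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(a, subst) for a in term[1:])

def successors(term, rules):
    """All one-step reducts of term, rewriting at any position."""
    out = set()
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            out.add(instantiate(rhs, subst))
    if not isinstance(term, str):
        for i, arg in enumerate(term[1:], start=1):
            for reduct in successors(arg, rules):
                out.add(term[:i] + (reduct,) + term[i + 1:])
    return out

def red(term, rules):
    """red(t): all finitely reachable reducts (assumes there are finitely many)."""
    seen, todo = {term}, [term]
    while todo:
        for s in successors(todo.pop(), rules):
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return seen

def nf(term, rules):
    """nf(t): the reducts of t that are normal forms."""
    return {s for s in red(term, rules) if not successors(s, rules)}

def is_constructor_term(term, constructors):
    return (not isinstance(term, str) and term[0] in constructors
            and all(is_constructor_term(a, constructors) for a in term[1:]))

def eval_(term, rules, constructors):
    """eval(t): the normal forms of t that are constructor terms (values)."""
    return {s for s in nf(term, rules) if is_constructor_term(s, constructors)}

# Rules a -> b, c -> b, f(x,b) -> g(x); constructors {b, g} are an assumption.
RULES = [(("a",), ("b",)), (("c",), ("b",)), (("f", "x", ("b",)), ("g", "x"))]
CONSTRUCTORS = {"b", "g"}

t1 = ("f", ("a",), ("c",))           # f(a,c)
t2 = ("f", ("a",), ("g", ("b",)))    # f(a,g(b))
print(len(red(t1, RULES)), nf(t1, RULES), eval_(t1, RULES, CONSTRUCTORS))
print(nf(t2, RULES), eval_(t2, RULES, CONSTRUCTORS))
```

For f(a,g(b)) the only normal form still contains the defined symbol f, so nf returns it while eval returns the empty set, in line with eval(t) = nf(t) ∩ T(C).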

We consider the following (Cauchy convergent) transfinite semantics: 
00— red(f) = {s G T‘^{E) \ t s}, 00— hnf(t) = 00— red(f) fl HNF^, 

00— nf(f) = 00— hnf(t) n NF^, and 00— eval(f) = 00— nf(f) fl T“(C) (note that 
these definitions implicitly say that 00— eval ^ 00— nf ^ 00— hnf ^ 00— red). The 
^strongly convergent’ versions are 00— Sred, 00— Shnf, 00— Snf, and 00— Seval using 
— instead of 

6 Semantics, Program Properties, and Approximation 

Semantics of programming languages and partial orders between semantics can
be used to express and analyze properties of programs. For instance, definedness
of (term) semantics is obviously monotone w.r.t. ⊑ (i.e., if S is defined and
S ⊑ S', then S' is defined); on the other hand, determinism is antimonotone
(i.e., if S' is deterministic and S ⊑ S', then S is deterministic). Given a TRS ℛ,
definedness of nf_ℛ is usually known as ℛ being normalizing [BN98]; definedness
of eval_ℛ corresponds to the standard notion of 'completely defined'; definedness
of ω-eval_ℛ is 'sufficient completeness' of [DKP91]. Determinism of nf_ℛ is the
standard unique normal form (w.r.t. reductions) property; determinism
of ∞-Snf_ℛ is the unique normal form (w.r.t. strongly convergent transfinite
reductions) property of [KKSV95].

A well-known property of TRSs that we can express in our framework is
neededness [HL91]. In [DM97], Durand and Middeldorp define neededness (for
normalization) without using the notion of residual (as in [HL91]). In order to
formalize it, the authors use an augmented signature Σ ∪ {•} (where • is a
new constant symbol). TRSs over a signature Σ are also used to reduce terms in
T(Σ ∪ {•}). According to this, neededness for normalization can be expressed as
follows (we omit the proof which easily follows using the definitions in [DM97]):

Theorem 10. Let ℛ = (Σ,R) be an orthogonal TRS. A redex t|_p in t ∈ T(Σ)
is needed for normalization if and only if nf_ℛ(t[•]_p) ⊆ T(Σ ∪ {•}) − T(Σ).

Theorem 10 suggests the following semantic definition of neededness.

Definition 3. Let S be a term semantics for the signature Σ ∪ {•}. Let
t ∈ T(Σ) − W_S and p ∈ Pos(t). Subterm t|_p is S-needed in t if S(t[•]_p) ⊆
T^ω(Σ ∪ {•}) − T^ω(Σ).

We do not consider any TRS in our definition but only a term semantics; thus,
we do not require that t|_p be a redex but just a subterm of t. The restriction
t ∈ T(Σ) − W_S in Definition 3 is natural since terms in W_S already have complete
semantic meaning according to S. Our notion of nf_ℛ-neededness coincides with
the standard notion of neededness for normalization when considering an orthogonal
TRS ℛ and we restrict the attention to redexes within terms. S-neededness
applies in other cases.

Example 7. Consider the TRS ℛ:

g(x) → c(g(x))    a → b

Redex a in g(a) is nf_ℛ-needed, since nf_ℛ(g(•)) = ∅. However, it is not
ω-Seval_ℛ-needed, since ω-Seval_ℛ(g(•)) = {c^ω}.

Neededness of redexes for infinitary or transfinite normalization has been studied
in [KKSV95] for strongly convergent sequences. As remarked by the authors,
their definition is closely related to Huet and Lévy's standard finitary one. In fact,
it is not difficult to see (considering Theorems0andP), that, for every orthogonal 
TRS 72, oj— Seval 7 ?,-needed redexes are needed in the sense of IRKSyfifi] . 

Unfortunately, S-neededness does not always 'naturally' coincide with other
well-known notions of neededness such as, e.g., root-neededness [Mid97].
Example 8. Consider the TRS ℛ:

a → b    c → b    f(x,b) → g(x)

and t = f(a,c) which is not root-stable (hence, f(a,c) ∉ W_hnf). Since derivation
f(a,c) → f(a,b) → g(a) does not reduce redex a in t, a is not root-needed.
However, f(•,c) → f(•,b) → g(•), which means that a is hnf-needed.
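
The test of Definition 3 can be mechanized for the finitary semantics nf when the term has finitely many reducts. The following sketch makes several assumptions: it uses nf rather than hnf (root-stability is undecidable in general), it does not check the side condition t ∈ T(Σ) − W_S, positions are tuples of 1-based argument indices, and all function names are illustrative. The extra rule k(x) → b is hypothetical and is added only to exhibit a subterm that is not needed.

```python
# Terms: a variable is a plain string; an application is a tuple (symbol, arg1, ...).
HOLE = ("•",)  # the fresh constant of the augmented signature

def match(pattern, term, subst):
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate(term, subst):
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(a, subst) for a in term[1:])

def successors(term, rules):
    """All one-step reducts of term, rewriting at any position."""
    out = set()
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            out.add(instantiate(rhs, subst))
    if not isinstance(term, str):
        for i, arg in enumerate(term[1:], start=1):
            for reduct in successors(arg, rules):
                out.add(term[:i] + (reduct,) + term[i + 1:])
    return out

def nf(term, rules):
    """Finitely reachable normal forms (assumes finitely many reducts)."""
    seen, todo = {term}, [term]
    while todo:
        for s in successors(todo.pop(), rules):
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return {s for s in seen if not successors(s, rules)}

def replace_at(term, position, new):
    """Return term with the subterm at the given position replaced by new."""
    if not position:
        return new
    i = position[0]
    return term[:i] + (replace_at(term[i], position[1:], new),) + term[i + 1:]

def contains_hole(term):
    return term == HOLE or (
        not isinstance(term, str) and any(contains_hole(a) for a in term[1:]))

def is_nf_needed(term, position, rules):
    """True if every normal form of term[•]_position still contains •."""
    return all(contains_hole(s) for s in nf(replace_at(term, position, HOLE), rules))

# Rules of Example 8 plus the hypothetical rule k(x) -> b.
RULES = [(("a",), ("b",)), (("c",), ("b",)),
         (("f", "x", ("b",)), ("g", "x")), (("k", "x"), ("b",))]

needed_in_f = is_nf_needed(("f", ("a",), ("c",)), (1,), RULES)
needed_in_k = is_nf_needed(("k", ("a",)), (1,), RULES)
print(needed_in_f, needed_in_k)
```

For f(a,c) the only normal form of f(•,c) is g(•), so the redex a passes the test; for k(a) the normal form b of k(•) no longer mentions •, so a is not needed there.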

This kind of problem has already been noticed in [DM97]. A deep comparison
between semantic neededness of Definition 3 and other kinds of neededness is
out of the scope of this paper. However, it allows us to show the use of ⊑ in
approximation. For instance, S-neededness is antimonotone with regard to ⊑.

Theorem 11. Let S,S' be term semantics for a signature Σ such that S ⊑ S'.
If t|_p is S'-needed in t ∈ T(Σ) − W_S', then t|_p is S-needed in t.

Theorem 11 suggests using ⊑ for approximating the neededness of a TRS, either
by using semantics for other TRSs or other semantics for the same TRS.
Given TRSs ℛ and 𝒮 over the same signature, we say that 𝒮 approximates ℛ if
→*_ℛ ⊆ →*_𝒮 and NF_ℛ = NF_𝒮 [DM97,Jac96]. An approximation of TRSs is a mapping
α from TRSs to TRSs with the property that α(ℛ) approximates ℛ [DM97].
By using approximations of TRSs we can decide a property of α(ℛ) (e.g., neededness)
which is valid (but often undecidable) for the 'concrete' TRS ℛ. In [DM97],
four approximations, namely s, nv, sh, and g, have been studied¹ and neededness
proved decidable w.r.t. these approximations. Since nf_ℛ ⊑ nf_α(ℛ), by Theorem
11, nf_α(ℛ)-neededness correctly approximates nf_ℛ-neededness (as proved in [DM97]).

A final example is redundancy of arguments. Given a term semantics S for the
signature Σ, the i-th argument of f ∈ Σ is redundant w.r.t. S if, for all context
C[], t ∈ T(Σ) such that root(t) = f, and s ∈ T(Σ), S(C[t]) = S(C[t[s]_i]).
Redundancy is antimonotone with regard to ^ but not w.r.t. E ISMl- 

7 Relating Transfinite, Infinitary, and Finitary Semantics 

Following our previous discussion about the usefulness of relating different
semantics, in this section, we investigate orders ⊑ and ⪯ among semantics
of Section 5.1. Strongly convergent sequences are Cauchy convergent; thus,
∞-Sφ ⊑ ∞-φ for φ ∈ {red, hnf, nf, eval}. In general, these inequalities are
strict: consider TZ in Example ^ We have 00— Seval 00— eval; otherwise, 
00— Seval(f (a,g)) = 00— eval(f (a,g)) fl Woo-Sevai- Since c‘^ G 00— eval (f (a, g)) 
but we have that c‘^ ^ 00— Seval(f (a,g)), it follows that ^ FFoo-Sevai- How- 
ever, G 00— Seval(h) = 00— eval(h) fl Woo-Sevai contradicts the latter. Thus, 
in general, 00— Seval ^ 00— eval, i.e., 00— Seval C 00— eval. By Proposition 
lEoo-Sevai E Woo-evai; thus, by Proposition Q we also have 00— Sp 00— tp, 

and hence 00— Si,9 IZ 00— p for p G (red, hnf, nf, eval}. By Theorem El for left- 
bounded, orthogonal TRSs, 00— nf = 00— Snf and 00— eval = 00— Seval. 

With regard to infinitary semantics, the situation is similar to the transfinite 
case (but consider Theorem 0 . We also have the following: 

¹ Names s, nv, sh, and g correspond to strong, NV, shallow, and growing
approximations, respectively.
Fig. 1. Semantics for a TRS ordered by ⪯

Fig. 2. Semantics for a left-linear / left-linear confluent / left-bounded orthogonal
TRS (ordered by ⪯)

Proposition 9. For all TRS, ^ Lo—ip for ip G {red, hnf, nf, eval}. 

By Propositions El and 13, (p ^ ui—Sip for (p G (red, hnf, nf, eval}. However (con- 
sider the TRS in Example EJ, in general eval oo— Seval (and eval oo— eval)! 
By Proposition □, no comparison (with is possible between (considered here) 
transfinite semantics and infinitary or finitary ones (except empty). Figure [D 
shows the hierarchy of semantics for a TRS ordered by 

By Theorem 0, for left-linear TRSs, we have oo—Sip = u—Sip for ip G 
(red, hnf, nf,eval|. Example 0 shows that w— eval oo— eval and, since w— eval C 

oo— eval, by Propositions El and 0 co—ip oo—ip for ip G (red, hnf, nf, eval}. 
As a simple consequence of Corollary 0 for every left-linear TRS, ip ^ oo—ip 
for ip G {red, hnf, nf, eval}. By additionally requiring (finitary) confluence, The- 
orem 0 entails that, for every left-linear, confluent TRS, oo— eval = w— eval. 
By using Theorem 0 we have that, for every left-bounded, orthogonal TRS, 
oo— nf = w— Snf. Diagrams of Figure 0 summarize these facts. Table 0 shows 
the appropriateness of different semantics for computing different kinds of in- 
teresting semantic values (see Remark 0. Left-linear, confluent TRSs provide 
the best framework for computing constructor terms, since infinitary, strongly 
convergent sequences suffice and determinism of computations is guaranteed. 
Table 1. Semantics for computing different canonical forms; (!) means determinism

                                 T(C)       NF_ℛ       T^ω(C)         NF^ω
arbitrary TRSs                   ∞-eval     ∞-nf       ∞-eval         ∞-nf
left-linear TRSs                 eval       nf         ∞-eval         ∞-nf
left-linear, confluent TRSs      eval (!)   nf (!)     ω-Seval (!)    ∞-nf
left-bounded, orthogonal TRSs    eval (!)   nf (!)     ω-Seval (!)    ω-Snf (!)



8 Related Work 

Our rewriting semantics are related to other (algebraic) approaches to semantics 
of recursive program schemes |Cou9n) and TRSs jijou85| . For instance, a com- 
putational semantics, Comp ij^ j^~^{t), of a ground term < in a TRS TZ — {S,R) 
is given in |Roii8R) as the collection of lub’s of increasing partial information 
obtained along maximal computations starting from t (see pijou85| . page 212). 
The partial information associated to each component of a rewrite sequence is 
obtained by using an interpretation ^ of 7?. which is a complete 27-algebra0 sat- 
isfying = 1. for all rules I ^ r G R. Consider TZ in Example El and let .4 be a 
27-algebra interpreting TZ. Since f (a) — >■ a is a rule of TZ, then /(a) = _L, where 
/ and a interpret f and a, respectively. Since _L C a and / is monotone, we 
have /(/(a)) = /(_L) C /(a) = _L, i.e., /(/(a)) = _L. In general, /"(a) = _L, for 
n > 1, and thus Comp^^^^^{f (&)) = {_L,a}. However, w— Snf(f (a)) = {a,f‘^}. 
Moreover, since TZ in Example El is confluent, according to , we can pro- 
vide another semantics (t) which is the lub of the interpretations of all 

finite reducts of t. In particular, (f (a) ) = |J{_L,a} = a, which actually 

corresponds to nf(f(a)), since there is no reference to the infinitary term 
obtained from f (a) by strongly convergent infinitary rewriting. 

Closer to ours, in jHIRI ITjiicflfl| . an observable semantics is given to (compu- 
tations of) terms without referring its meaning to any external semantic domain. 
An observation mapping is a lower closure operator on i7-terms ordered by a 
partial order C on them. An adequate observation mapping (] |) permits us to 
describe computations which are issued from a term t as the set of all obser- 
vations (]s[) of finitary reducts s of t. An observation mapping d [) is adequate 
for observing rewriting computations if the observations of terms according to 
d D are refined as long as the computation proceeds. For instance, the observa- 
tion mapping d Otj that yields the normal form of a term w.r.t. Huet and Levy’s 
C-reductions is adequate for observing rewriting computations fLucOOj . 

However, using d [la; to observe computations issued from f (a) in the TRS of 
Example Eldoes not provide any information about f“, since only the set {17, a} 
is obtained. 

In [DKP91], continuous Σ-algebras (based on quasi-orders) are used to provide
models for ω-canonical TRSs, i.e., ω-confluent and ω-normalizing (that is,



By a complete Σ-algebra we mean 𝒜 = (D, ⊑, ⊥, {f_𝒜 | f ∈ Σ}) where (D, ⊑, ⊥) is
a cpo with least element ⊥ and each f_𝒜 is continuous (hence monotone) [Bou85].
every term has a normal form which is reachable in at most ω steps) TRSs.
Models are intended to provide inequality of left- and right-hand sides rather 
than equality, as usual. Also, in contrast to j H0118RIH I i)1 II jucOOj . semantics is 
given to symbols (via interpretations) rather than terms (via computations). 
This semantic framework does not apply to ℛ in Example 6 since it is not
ω-canonical (it lacks ω-confluence).



Acknowledgements. I thank the anonymous referees for their remarks. Ex- 
ample 0| was suggested by a referee. 
Abstract. We give a general goal directed method for solving the E-unification
problem. Our inference system is a generalization of the inference
rules for Syntactic Theories, except that our inference system is
proved complete for any equational theory. We also show how to easily
modify our inference system into a more restricted inference system for
Syntactic Theories, and show that our completeness techniques prove
completeness there also.



1 Introduction 

E-unification is a problem that arises in several areas of computer science,
including automated deduction, formal verification and type inference. The problem
is, given an equational theory E and a goal equation u ≈ v, to find the set of
all substitutions θ such that uθ and vθ are identical modulo E. In practice, it is
not necessary to find all such substitutions. We only need to find a set from which
all such substitutions can be generated, called a complete set of E-unifiers.

The decision version of E-unification (Does an E-unifier exist?) is an undecidable
problem, even for the simpler word problem which asks if all substitutions
θ will make uθ and vθ equivalent modulo E. However, there are procedures which
are complete for the problem. Complete, in this sense, means that each E-unifier
in a complete set will be generated eventually. However, because of the undecidability,
the procedure may continue to search for an E-unifier forever, when no
E-unifier exists.

One of the most successful general methods for solving the E-unification
problem has been Knuth-Bendix Completion ^2] (in particular, Unfailing 
Completion |2j) plus Narrowing 0. This procedure deduces new equalities from 
E. If the procedure ever halts, it solves the word problem. However, because of 
the undecidability, Knuth-Bendix Completion cannot always halt. 

Our goal in this paper is to develop an alternative E-unification procedure.
Why do we want an alternative to Knuth-Bendix Completion? There are several
reasons. First, there are simple equational theories for which Completion does
not halt. An example is the equational theory E = {f(g(f(x))) ≈ g(f(x))}. So
Completion cannot be used to decide any word problem in this theory, even a simple
example like a ≈ b, which is obviously not true. Using our method, examples
like this will quickly halt and say there is no solution.

** This work was supported by NSF grant number CCR-9712388. 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 2.S1- TH^ 2001. 
A related deficiency of Completion is that it is difficult to identify classes of
equational theories where the procedure halts, and to analyze the complexity 
of solving those classes. That is our main motivation for this line of research. 
We do not pursue that subject in this paper, since we first need to develop a 
complete inference system. That subject is addressed in inng, where we deal 
with classes of equations where E-unification is decidable in an inference system
similar to the one given in this paper.

Another aspect of Completion is that it is insensitive to the goal. It is possi- 
ble to develop heuristics based on the goal, but problems like the example above 
still exist, because of the insensitivity to the goal. The method we develop in this 
paper is goal directed, in the sense that every inference step is a step backwards 
from the goal, breaking the given goal into separate subgoals. Therefore we call 
our method a goal directed inference system for equational reasoning. This qual- 
ity of goal-directedness is especially important when combining an equational 
inference system with another inference system. Most of the higher order in- 
ference systems used for formal verification have been goal directed inference 
systems. Even most inference systems for first order logic, like OTTER, are of- 
ten run with a set of support strategy. For things like formal verification, we 
need equality inference systems that can be added as submodules of previously 
existing inference systems. We believe that the best method for achieving this is 
to have a goal directed equality inference system. 

We do not claim that our procedure is the first goal directed equational in- 
ference system. Our inference system is similar to the inference system Syntactic 
Mutation first developed by Claude Kirchner Pfrn) . That inference system ap- 
plies to a special class of equational theories called Syntactic Theories. In such 
theories, any true equation has an equational proof with at most one step at 
the root. The problem of determining if an equational theory is syntactic is 
undecidable [11]. In the Syntactic Mutation inference system, it is possible to determine
which inference rule to apply next by looking at the root symbols on the
two sides of a goal equation. This restricts which inference rules can be applied 
at each point, and makes the inference system more efficient than a blind search. 

Our inference system applies to every equational theory, rather than just 
Syntactic Theories. Therefore, it would be incomplete for us to examine the root 
symbol at both sides of a goal equation. However, we do prove that we may ex- 
amine the root symbol of one side of an equation to decide which inference rule 
to apply. Other than that, our inference system is similar to Syntactic Mutation. 
We prove that our inference system is complete. The Syntactic Mutation rules 
were never proved to be complete. In 0, it is stated that there is a problem 
proving completeness because the Variable Elimination rule (called “Replace- 
ment” there) does not preserve the form of the proof. We think we effectively 
deal with that problem. 

There is still an open problem of whether the Variable Elimination rule can
be applied eagerly.¹ We have not solved that problem. But we have avoided those
problems as much as possible. The inefficiency of the procedure comes from cases 



^ See |lt)| for a discussion of the problem and a solution for a very specific case. 
where one side of a goal equation is a variable. We prove that any equation where
both sides are variables may be ignored without losing completeness. We also 
orient equations so that inference rules are applied to the nonvariable side of an 
equation. This gives some of the advantages of Eager Variable Elimination. 

Other similar goal-directed inference procedures are given in |4I5| . The in- 
ference system from 0 is called BT. The one in is called Trans. The main 
difference between our results and the results in those papers is that all of our inference
rules involve a root symbol of a term in the goal. This limits the number of
inference rules that can be applied at any point. For BT and Trans there are
inference rules that only involve variables in the goal. These rules allow an explosion
of inferences at each step, which expands the search space. This is similar to
the situation in Paramodulation, where completeness proofs required paramodulation
into variables until Brand [3] proved that this was not necessary for completeness.
We believe that the completeness results in this paper are analogous to the
results of Brand, but for goal-directed E-unification. In the case of Paramodula- 
tion, the results of Brand prove essential in practice. Another difference between 
our results and BT and Trans is that those papers require Variable Elimination, 
while ours do not. Gallier and Snyder P] pointed out the problem of inference 
rules involving variables. However, their solution was to design a different infer- 
ence system, called T, that allows inferences below the root. Our results solve 
this problem without requiring inferences below the root. The problem of Eager 
Variable Elimination was first presented in j^. 

The format of the paper is to first give some preliminary definitions. Then we
present our inference system. After a discussion of normal form, we present
soundness results. In order to prove completeness, we first give a bottom-up 
method for deducing ground equations, then use that method to prove com- 
pleteness of our goal-directed method. After that we show how our completeness 
technique can be applied to Syntactic Theories to show completeness of a proce- 
dure similar to Syntactic Mutation. Finally, we conclude the paper. All missing 
proofs are in H21. 



2 Preliminaries 

We assume we are given a set of variables and a set of uninterpreted function
symbols of various arities. An arity is a non-negative integer. Terms are defined
recursively in the following way: each variable is a term, and if t_1, ..., t_n are
terms, and f is of arity n ≥ 0, then f(t_1, ..., t_n) is a term, and f is the symbol
at the root of f(t_1, ..., t_n). A term (or any object) without variables is called
ground. We consider equations of the form s ≈ t, where s and t are terms. Please
note that throughout this paper these equations are considered to be oriented,
so that s ≈ t is a different equation than t ≈ s. Let E be a set of equations, and
u ≈ v be an equation, then we write E ⊨ u ≈ v (or u =_E v) if u ≈ v is true
in any model containing E. If G is a set of equations, then E ⊨ G means that
E ⊨ e for all e in G.
A substitution is a mapping from the set of variables to the set of terms,
such that it is almost everywhere the identity. We identify a substitution with
its homomorphic extension. If $\theta$ is a substitution then $Dom(\theta) = \{x \mid x\theta \neq x\}$.
A substitution $\theta$ is an E-unifier of an equation $u \approx v$ if $E \models u\theta \approx v\theta$. $\theta$ is an
E-unifier of a set of equations $G$ if $\theta$ is an E-unifier of all equations in $G$.

If $\sigma$ and $\theta$ are substitutions, then we write $\sigma \leq_E \theta[Var(G)]$ if there is a
substitution $\rho$ such that $E \models x\sigma\rho \approx x\theta$ for all $x$ appearing in $G$. If $G$ is a set
of equations, then a substitution $\theta$ is a most general unifier of $G$, written
$\theta = mgu(G)$, if $\theta$ is an E-unifier of $G$, and for all E-unifiers $\sigma$ of $G$,
$\theta \leq_E \sigma[Var(G)]$. A complete set of E-unifiers of $G$ is a set $\Theta$ of E-unifiers of $G$
such that for all E-unifiers $\sigma$ of $G$, there is a $\theta$ in $\Theta$ such that $\theta \leq_E \sigma[Var(G)]$.
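
The definitions above can be made concrete with a small sketch. The encoding is an assumption of this illustration and not part of the paper: variables are Python strings, a compound term $f(t_1, \ldots, t_n)$ is the tuple `("f", t1, ..., tn)`, a constant is a one-element tuple, and a substitution is a dictionary that is the identity on every variable it does not mention. The function names are hypothetical.

```python
def is_var(t):
    """Variables are encoded as strings, compound terms as tuples."""
    return isinstance(t, str)

def apply(t, theta):
    """Homomorphic extension of the substitution theta to the term t."""
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(apply(a, theta) for a in t[1:])

def domain(theta):
    """Dom(theta) = {x | x theta != x}."""
    return {x for x, t in theta.items() if t != x}

def compose(sigma, rho):
    """The substitution sigma rho: apply sigma first, then rho."""
    out = {x: apply(t, rho) for x, t in sigma.items()}
    for x, t in rho.items():
        out.setdefault(x, t)
    return out

def is_ground(t):
    """A term is ground when it contains no variables."""
    return not is_var(t) and all(is_ground(a) for a in t[1:])
```

With this encoding, `apply(apply(t, sigma), rho)` and `apply(t, compose(sigma, rho))` agree, which is the sense in which a substitution is identified with its homomorphic extension.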

3 The Goal Directed Inference Rules

In this section, we will give a set of inference rules for finding a complete set of
E-unifiers of a goal $G$, and in the following sections we prove that, for every goal
$G$ and substitution $\theta$ such that $E \models G\theta$, $G$ can be converted into a normal form
(see Section 4), which determines a substitution which is more general than $\theta$.
The inference rules decompose an equational proof by choosing a potential step
in the proof and leaving what is remaining when that step is removed.

We define two special kinds of equations appearing in the goal $G$. An equation
of the form $x \approx y$ where $x$ and $y$ are both variables is called a variable-variable
equation. An equation $x \approx t$ appearing in $G$ where $x$ only appears once in $G$ is
called solved.

As in Logic Programming, we can have a selection rule for goals. For each
goal $G$, we don't-care nondeterministically select an equation $u \approx v$ from $G$,
such that $u \approx v$ is not a variable-variable equation and $u \approx v$ is not solved. We
say that $u \approx v$ is selected in $G$. If there is no such equation $u \approx v$ in the goal,
then nothing is selected. We will prove that if nothing is selected, then the goal
is in normal form and a most general E-unifier can be easily determined.
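
A minimal sketch of the selection rule follows, under assumptions that are not in the paper: a goal is a list of pairs `(lhs, rhs)`, variables are strings, compound terms are tuples `("f", t1, ..., tn)`, and the helper names are hypothetical. It only computes which equations may be selected; the choice among them is don't-care nondeterministic.

```python
def is_var(t):
    return isinstance(t, str)

def occurrences(x, t):
    """Number of occurrences of the variable x in the term t."""
    if is_var(t):
        return 1 if t == x else 0
    return sum(occurrences(x, a) for a in t[1:])

def is_solved(eq, goal):
    """x = t is solved when the variable x occurs exactly once in the goal."""
    lhs, _ = eq
    if not is_var(lhs):
        return False
    total = sum(occurrences(lhs, l) + occurrences(lhs, r) for l, r in goal)
    return total == 1

def selectable(goal):
    """Equations that are neither variable-variable nor solved."""
    return [eq for eq in goal
            if not (is_var(eq[0]) and is_var(eq[1]))
            and not is_solved(eq, goal)]
```

An empty result from `selectable` corresponds to the situation in which nothing is selected.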

There is a Decomposition rule. 

Decomposition 

\[
\frac{\{f(s_1, \ldots, s_n) \approx f(t_1, \ldots, t_n)\} \cup G}
     {\{s_1 \approx t_1, \ldots, s_n \approx t_n\} \cup G}
\]

where $f(s_1, \ldots, s_n) \approx f(t_1, \ldots, t_n)$ is selected in the goal.

This is just an application of the Congruence Axiom, in a goal-directed way.
If $f$ is of arity 0 (a constant) then this is a goal-directed application of Reflexivity.

We additionally add a second inference rule that is applied when one side of 
an equation is a variable. 

Variable Decomposition 

\[
\frac{\{x \approx f(t_1, \ldots, t_n)\} \cup G}
     {\{x \approx f(x_1, \ldots, x_n)\} \cup (\{x_1 \approx t_1, \ldots, x_n \approx t_n\} \cup G)[x \mapsto f(x_1, \ldots, x_n)]}
\]

where $x$ is a variable, and $x \approx f(t_1, \ldots, t_n)$ is selected in the goal.

This is similar to the Variable Elimination rule for syntactic equalities. It can
be considered a gradual form of Variable Elimination, since it is done one step 
at a time. This rule is the same as the rule that is called Imitation in Trans and 
Root Imitation in BT. We have chosen our name to emphasize its relationship 
with the Decomposition rule. 

Now we add a rule called Mutate. We call it Mutate, because it is very similar 
to the inference rule Mutate that is used in the inference procedure for syntactic 
theories. Mutate is a kind of goal-directed application of Transitivity, but only 
transitivity steps involving equations from the theory. 

Mutate 

\[
\frac{\{u \approx f(v_1, \ldots, v_n)\} \cup G}
     {\{u \approx s,\ t_1 \approx v_1, \ldots, t_n \approx v_n\} \cup G}
\]

where $u \approx f(v_1, \ldots, v_n)$ is selected in the goal, and $s \approx f(t_1, \ldots, t_n) \in E$.

This rule assumes that there is an equational proof of the goal equation at
the root of the equation (see Section [?]). If one of the equations in this proof is
$s \approx t$ then that breaks up the proof at the root into two separate parts. We have
performed a Decomposition on one of the two equations that is created. Contrast
this with the procedure for Syntactic Theories [?] which allows a Decomposition
on both of the newly created equations. However, that procedure only works for
Syntactic Theories, whereas our procedure is complete for any equational theory.
The names of our inference rules are chosen to coincide with the names from [?].
In Trans the Mutate rule is called Lazy Narrowing, and in BT it is called Root
Rewriting.

Next we give a Mutate rule for the case when one side of the equation from 
E is a variable. 

Variable Mutate 



\[
\frac{\{u \approx f(v_1, \ldots, v_n)\} \cup G}
     {\{u \approx s\}[x \mapsto f(x_1, \ldots, x_n)] \cup \{x_1 \approx v_1, \ldots, x_n \approx v_n\} \cup G}
\]

where $s \approx x \in E$, $x$ is a variable, and $u \approx f(v_1, \ldots, v_n)$ is selected in the goal.
This is called Application of a Trivial Clause in Trans, and it is a special case
of Root Rewriting in BT.

We will write $G \rightarrow G'$ to indicate that $G$ goes to $G'$ by one application of
an inference rule. Then $\xrightarrow{*}$ is the reflexive, transitive closure of $\rightarrow$.

When an inference is performed, we may eagerly reorient any new equations
in the goal. The way they are reoriented is don't-care nondeterministic, except
that any equation of the form $t \approx x$, where $t$ is not a variable and $x$ is a
variable, must be reoriented to $x \approx t$. This way there is never an equation with
a nonvariable on the left hand side and a variable on the right hand side.

^ For simplicity, we assume that E is closed under symmetry. 

^ $s \approx f(t_1, \ldots, t_n)$ is actually a variant of an equation in $E$ such that it has no
variables in common with the goal. We assume this throughout the paper.
We will prove that the above inference rules solve a goal $G$ by transforming it
into normal forms representing a complete set of E-unifiers of $G$. There are two
sources of non-determinism involved in the procedure defined by the inference
rules. The first is "don't-care" non-determinism in deciding which equation to
select in the goal, and in deciding which way to orient equations with non-
variable terms on both sides. The second is "don't-know" non-determinism in
deciding which rule to apply. Not all paths of inference steps will lead us to the
normal form, and we do not know beforehand which ones do.
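
To illustrate the "don't-know" non-determinism, the four rules of this section can be sketched as a function that returns every goal reachable in one step from a chosen selected equation. This is only a sketch under stated assumptions: variables are strings, compound terms are tuples `("f", t1, ..., tn)`, a goal is a list of pairs, `E` is passed in already closed under symmetry, and `fresh` is an iterator of variable names that occur nowhere else. The names `successors`, `rename`, `orient`, `subst` and `variables` are hypothetical.

```python
import itertools

def is_var(t):
    return isinstance(t, str)

def subst(t, x, r):
    """Replace every occurrence of the variable x in t by the term r."""
    if is_var(t):
        return r if t == x else t
    return (t[0],) + tuple(subst(a, x, r) for a in t[1:])

def variables(t):
    if is_var(t):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def rename(eq, fresh):
    """A variant of an equation of E sharing no variables with the goal."""
    s, t = eq
    for x in sorted(variables(s) | variables(t)):
        y = next(fresh)
        s, t = subst(s, x, y), subst(t, x, y)
    return s, t

def orient(eq):
    """Reorient t = x (t not a variable, x a variable) to x = t."""
    s, t = eq
    return (t, s) if is_var(t) and not is_var(s) else (s, t)

def successors(goal, i, E, fresh):
    """All goals obtained by applying one rule to the selected goal[i]."""
    u, v = goal[i]                      # v is not a variable
    rest = goal[:i] + goal[i + 1:]
    f, vs = v[0], v[1:]
    out = []
    if not is_var(u) and u[0] == f and len(u) == len(v):      # Decomposition
        out.append(list(zip(u[1:], vs)) + rest)
    if is_var(u):                                    # Variable Decomposition
        xs = tuple(next(fresh) for _ in vs)
        fx = (f,) + xs
        body = list(zip(xs, vs)) + rest
        out.append([(u, fx)] +
                   [(subst(l, u, fx), subst(r, u, fx)) for l, r in body])
    for eq in E:
        s, t = rename(eq, fresh)
        if not is_var(t) and t[0] == f and len(t) == len(v):          # Mutate
            out.append([(u, s)] + list(zip(t[1:], vs)) + rest)
        elif is_var(t):                                      # Variable Mutate
            xs = tuple(next(fresh) for _ in vs)
            fx = (f,) + xs
            out.append([(subst(u, t, fx), subst(s, t, fx))] +
                       list(zip(xs, vs)) + rest)
    return [[orient(e) for e in g] for g in out]
```

A complete procedure would explore these successors with some fair search strategy; that part is deliberately left out of the sketch.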

4 Normal Form 

Notice that there are no inference rules that apply to an equation $x \approx y$, where
$x$ and $y$ are both variables. In fact, such an equation can never be selected. The
reason is that so many Mutate and Variable Decomposition inferences could
possibly apply to variable-variable pairs (as in BT) that we have designed the
system to avoid them. That changes the usual definition of normal form, as in
Standard Unification, and shows that inferences with variable-variable pairs are
unnecessary.

Let $G$ be a goal of the form $\{x_1 \approx t_1, \ldots, x_n \approx t_n, y_1 \approx z_1, \ldots, y_m \approx z_m\}$,
where all $x_i$, $y_i$ and $z_i$ are variables, the $t_i$ are not variables, and for all $i$ and $j$,

1. $x_i \notin Var(t_j)$,

2. $x_i \neq y_j$ and

3. $x_i \neq z_j$.

Then $G$ is said to be in normal form. Let $\sigma_G$ be the substitution
$[x_1 \mapsto t_1, \ldots, x_n \mapsto t_n]$. Let $\tau_G$ be a most general (syntactic) unifier of
$y_1 = z_1, \ldots, y_m = z_m$, with no new variables, such as what is calculated by a
syntactic unification procedure. We know an mgu of only variable-variable
equations must exist. Any such unifier effectively divides the variables into
equivalence classes such that for each class $C$, there is some variable $z$ in $C$
such that $y\tau_G = z$ for all $y \in C$. Then we write $\hat{y} = z$. Note that for any
E-unifier $\theta$ of $G$, $y\theta =_E \hat{y}\theta$. Finally, define $\theta_G$ to be the substitution $\sigma_G\tau_G$.

Proposition 1. A goal with nothing selected is in normal form.

Proof. Let $G$ be a goal with nothing selected. Then all equations in $G$ have a
variable on the left hand side. So $G$ is of the form $x_1 \approx t_1, \ldots, x_n \approx t_n,
y_1 \approx z_1, \ldots, y_m \approx z_m$. Since nothing is selected, each equation $x_i \approx t_i$ must be
solved. So each $x_i$ appears only once in $G$. Therefore the three conditions of
normal form are satisfied. □

Now we will prove that the substitution represented by a goal in normal form
is a most general E-unifier of that goal.



^ This is similar to the flex-flex pairs for higher order unification in [?].
Lemma 1. Let $G$ be a set of equations in normal form. Then $\theta_G$ is a most
general E-unifier of $G$.

Proof. Let $G$ be the goal $\{x_1 \approx t_1, \ldots, x_n \approx t_n, y_1 \approx z_1, \ldots, y_m \approx z_m\}$, such that
for all $i$ and $j$, $x_i \notin Var(t_j)$, $x_i \neq y_j$ and $x_i \neq z_j$. Let $\sigma_G = [x_1 \mapsto t_1, \ldots, x_n \mapsto t_n]$.
Let $\tau_G = mgu(y_1 = z_1, \ldots, y_m = z_m)$. Let $\theta_G = \sigma_G\tau_G$. We will prove that $\theta_G$ is
a most general E-unifier of $G$.

Let $i$ and $j$ be integers such that $1 \leq i \leq n$ and $1 \leq j \leq m$. First we need to
show that $\theta_G$ is a unifier of $G$, i.e. that $x_i\theta_G = t_i\theta_G$ and $y_j\theta_G = z_j\theta_G$. In other
words, prove that $x_i\sigma_G\tau_G = t_i\sigma_G\tau_G$ and $y_j\sigma_G\tau_G = z_j\sigma_G\tau_G$. Since $t_i$, $y_j$ and $z_j$
contain no variables in the domain of $\sigma_G$, this is equivalent to $t_i\tau_G = t_i\tau_G$ and
$y_j\tau_G = z_j\tau_G$, which is trivially true, since $\tau_G$ is an mgu of $\{y_1 \approx z_1, \ldots, y_m \approx z_m\}$.

Next we need to show that $\theta_G$ is more general than all other unifiers of $G$. So
let $\theta$ be an E-unifier of $G$. In other words, $x_i\theta =_E t_i\theta$ and $y_j\theta =_E z_j\theta$. We need
to show that $\theta_G \leq_E \theta[Var(G)]$. In particular, we will show that $G\theta_G\theta =_E G\theta$.

Then $x_i\theta_G\theta = x_i\sigma_G\tau_G\theta = t_i\tau_G\theta =_E t_i\theta =_E x_i\theta$. The only step that needs
justification is the fact that $t_i\tau_G\theta =_E t_i\theta$. This can be verified by examining
the variables of $t_i$. So let $w$ be a variable in $t_i$. If $w \notin Dom(\tau_G)$ then obviously
$w\tau_G\theta = w\theta$. If $w \in Dom(\tau_G)$ then $w$ is some $y_k$. Note that
$y_k\tau_G\theta = \hat{y}_k\theta =_E y_k\theta$, where $\hat{y}_k = y_k\tau_G$ is the representative of the class of $y_k$.
So $t_i\tau_G\theta =_E t_i\theta$.

Also, $y_j\theta_G\theta = y_j\sigma_G\tau_G\theta = y_j\tau_G\theta = \hat{y}_j\theta =_E y_j\theta$. Similarly for $z_j$. □
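
The substitution $\theta_G = \sigma_G\tau_G$ represented by a normal form can be read off mechanically. The following sketch assumes, beyond what the paper states, that a goal is a list of pairs with variables encoded as strings and compound terms as tuples; it computes $\tau_G$ with a simple union-find over the variable-variable equations. Which variable ends up as the representative of a class depends on processing order; the function name is hypothetical.

```python
def is_var(t):
    return isinstance(t, str)

def apply(t, theta):
    if is_var(t):
        return theta.get(t, t)
    return (t[0],) + tuple(apply(a, theta) for a in t[1:])

def normal_form_unifier(goal):
    """Return theta_G = sigma_G tau_G for a goal already in normal form."""
    parent = {}

    def find(x):
        # Representative of the equivalence class of the variable x.
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    sigma = {}
    for lhs, rhs in goal:
        if is_var(rhs):                 # variable-variable equation y = z
            parent[find(lhs)] = find(rhs)
        else:                           # solved equation x = t
            sigma[lhs] = rhs
    tau = {y: find(y) for y in list(parent) if find(y) != y}
    theta = {x: apply(t, tau) for x, t in sigma.items()}
    theta.update(tau)                   # the x_i never occur among y_j, z_j
    return theta
```

The function does not check the three normal form conditions; it assumes they hold.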



5 An Example 

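Here is an example of the procedure. (The selected equations are underlined.)

Example 1. Let $E = E_0 = \{ffx \approx gfx\}$, $G = G_0 = \{\underline{fgfy \approx ggfz}\}$.

By rule Mutate applied to $G_0$ we have

$G_1 = \{\underline{fgfy \approx ffx_1},\ fx_1 \approx gfz\}$.

After Decomposition,

$G_2 = \{\underline{gfy \approx fx_1},\ fx_1 \approx gfz\}$.

After Mutate,

$G_3 = \{\underline{gfy \approx gfx_2},\ x_1 \approx fx_2,\ fx_1 \approx gfz\}$.

After Decomposition is used 2 times on $G_3$,

$G_4 = \{y \approx x_2,\ \underline{x_1 \approx fx_2},\ fx_1 \approx gfz\}$.

Variable Decomposition:

$G_5 = \{y \approx x_2,\ x_1 \approx fx_3,\ x_3 \approx x_2,\ \underline{ffx_3 \approx gfz}\}$.

Mutate:

$G_6 = \{y \approx x_2,\ x_1 \approx fx_3,\ x_3 \approx x_2,\ \underline{ffx_3 \approx ffx_4},\ fx_4 \approx fz\}$.

2x Decomposition:

$G_7 = \{y \approx x_2,\ x_1 \approx fx_3,\ x_3 \approx x_2,\ x_3 \approx x_4,\ \underline{fx_4 \approx fz}\}$.

Decomposition:

$G_8 = \{y \approx x_2,\ x_1 \approx fx_3,\ x_3 \approx x_2,\ x_3 \approx x_4,\ x_4 \approx z\}$.

The extended $\theta'$ that unifies the goal $G_0$ is equal to:
$[x_1 \mapsto fx_3][y \mapsto z, x_3 \mapsto z, x_2 \mapsto z, x_4 \mapsto z]$. $\theta'$ is equivalent on the variables
of $G$ to $\theta$ equal to: $[y \mapsto z]$.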
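
As a check on the example (a worked verification added here, not taken from the paper), the unifier $[y \mapsto z]$ can be confirmed directly against $E = \{ffx \approx gfx\}$. Each step below uses one instance of that single equation, with parentheses written out:

```latex
\begin{align*}
f(g(f(z))) &=_E f(f(f(z))) && \text{$gfx \approx ffx$ with $x \mapsto z$, below the root} \\
           &=_E g(f(f(z))) && \text{$ffx \approx gfx$ with $x \mapsto f(z)$, at the root} \\
           &=_E g(g(f(z))) && \text{$ffx \approx gfx$ with $x \mapsto z$, below the root}
\end{align*}
```

Hence $(fgfy)[y \mapsto z] =_E ggfz$, so $[y \mapsto z]$ is indeed an E-unifier of $G_0$.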
6 Soundness

Theorem 1. The above procedure is sound, i.e. if $G' \xrightarrow{*} G$ and $G$ is in normal
form, then $E \models G'\theta_G$.



7 A Bottom Up Inference System 



In order to prove the completeness of this procedure, we first define a bottom-up 
equational proof using Congruence and Equation Application rules. We prove 
that this equational proof is equivalent to the usual definition of equational 
proof for ground terms, which involves Reflexivity, Symmetry, Transitivity and 
Congruence. 



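Congruence:

\[
\frac{s_1 \approx t_1 \quad \cdots \quad s_n \approx t_n}
     {f(s_1, \ldots, s_n) \approx f(t_1, \ldots, t_n)}
\]

Equation Application:

\[
\frac{u \approx s \qquad t \approx v}{u \approx v}
\]

if $s \approx t$ is a ground instance of an equation in $E$.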



We define $E \vdash u \approx v$ if there is a proof of $u \approx v$ using the Congruence and
Equation Application rules. If $\pi$ is a proof, then $|\pi|$ is the number of steps in the
proof. $|u \approx v|_E$ is the number of steps in the shortest proof of $u \approx v$.

We need to prove that $\{u \approx v \mid E \vdash u \approx v\}$ is closed under Reflexivity,
Symmetry and Transitivity. First we prove Reflexivity.

Lemma 2. Let $E$ be an equational theory. Then $E \vdash u \approx u$ for all ground $u$.

Next we prove closure under symmetry.

Lemma 3. Let $E$ be an equational theory such that $E \vdash u \approx v$ and $|u \approx v|_E = n$.
Then $E \vdash v \approx u$, and $|v \approx u|_E = n$.

Next we show closure under Transitivity.

Lemma 4. Let $E$ be an equational theory such that $E \vdash s \approx t$ and $E \vdash t \approx u$.
Suppose that $|s \approx t|_E = m$ and $|t \approx u|_E = n$. Then $E \vdash s \approx u$, and
$|s \approx u|_E \leq m + n$.

Closure under Congruence is trivial. Now we put these lemmas together to
show that anything true under the semantic definition of Equality is also true
under the syntactic definition given here.

Theorem 2. If $E \models u \approx v$, then $E \vdash u \approx v$, for all ground $u$ and $v$.

We can restrict our proofs to only certain kinds of proofs. In particular, if the 
root step of a proof tree is an Equation Application, then we can show there is a 
proof such that the proof step of the right child is not an Equation Application. 
Lemma 5. Let $\pi$ be a proof of $u \approx v$ in $E$, which is derived by Equation Ap-
plication, and whose right child is also derived by Equation Application. Then
there is a proof $\pi'$ of $u \approx v$ in $E$ such that the root of $\pi'$ is Equation Application
but the right child is derived by Congruence, and $|\pi'| = |\pi|$.

Proof. Let $\pi$ be a proof of $u \approx v$ in $E$ such that the step at the top is Equation
Application, and the step at the right child is also Equation Application. We
will show that there is another proof $\pi'$ of $u \approx v$ in $E$ such that $|\pi'| = |\pi|$, and
the size of the right subtree of $\pi'$ is smaller than the size of the right subtree of
$\pi$. So this proof is an induction on the size of the right subtree of the proof.

Suppose $u \approx v$ is at the root of $\pi$ and $u \approx s$ labels the left child $\Pi_1$. Suppose
the right child $\Pi_2$ is labeled with $t \approx v$. Further suppose that the left child of
$\Pi_2$ is labeled with $t \approx w_1$ and the right child of $\Pi_2$ is labeled with $w_2 \approx v$. Then
$s \approx t$ and $w_1 \approx w_2$ must be ground instances of members of $E$.

\[
\dfrac{\dfrac{\pi_1}{\Pi_1:\ u \approx s} \qquad
       \dfrac{\dfrac{\pi_2}{t \approx w_1} \quad \dfrac{\pi_3}{w_2 \approx v}}
             {\Pi_2:\ t \approx v}\ \text{Eq. App.}}
      {u \approx v}\ \text{Eq. App.}
\]

Then we can let $\pi'$ be the proof whose root is labeled with $u \approx v$, whose left
child $\Pi_3$ is labeled with $u \approx w_1$. Let the left child of $\Pi_3$ be labeled with $u \approx s$
and the right child of $\Pi_3$ be labeled with $t \approx w_1$. Also let the right child of the
root $\Pi_4$ be labeled with $w_2 \approx v$.

\[
\dfrac{\dfrac{\dfrac{\pi_1}{u \approx s} \quad \dfrac{\pi_2}{t \approx w_1}}
             {\Pi_3:\ u \approx w_1}\ \text{Eq. App.} \qquad
       \dfrac{\pi_3}{\Pi_4:\ w_2 \approx v}}
      {u \approx v}\ \text{Eq. App.}
\]

By induction, $\pi'$ is a proof of $u \approx v$ of the same size as $\pi$.



□ 
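
The transformation in this proof is a tree rotation, and can be sketched as follows. The proof-tree classes and the names `size` and `rotate` are assumptions of this illustration; node labels other than the equation instance used by an Equation Application are omitted, since only the shape of the proof matters here.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Cong:
    """Congruence step with the proofs of the argument equations as children."""
    children: Tuple[object, ...] = ()

@dataclass(frozen=True)
class EqApp:
    """Equation Application: left proves u = s, right proves t = v,
    and instance is the ground instance s = t of an equation in E."""
    left: object
    instance: Tuple[str, str]
    right: object

def size(p):
    """Number of steps in the proof tree p."""
    if isinstance(p, Cong):
        return 1 + sum(size(c) for c in p.children)
    return 1 + size(p.left) + size(p.right)

def rotate(p):
    """While the right child of an Equation Application root is itself an
    Equation Application, move that step into the left subtree."""
    while isinstance(p, EqApp) and isinstance(p.right, EqApp):
        r = p.right
        p = EqApp(EqApp(p.left, p.instance, r.left), r.instance, r.right)
    return p
```

Each rotation keeps the same nodes and only regroups them, which is why the size of the proof is unchanged while the right subtree shrinks.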



8 Completeness of the Goal-Directed Inference System 

Now we finally get to the main theorem of this paper, which is the completeness
of the inference rules given in Section 3. But first we need to define a measure
on the equations in the goal.

Definition 1. Let $E$ be an equational theory and $G$ be a goal. Let $\theta$ be a substi-
tution such that $E \models G\theta$. We will define a measure $\mu$, parameterized by $\theta$ and
$G$. Define $\mu(G, \theta)$ as the multiset $\{|u\theta \approx v\theta|_E \mid u \approx v$ is an unsolved equation in
$G\}$.
The intention of the definition is that the measure of an equation in a goal is
the number of steps it takes to prove that equation. However, solved equations
are ignored.
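
Measures of this kind are compared as multisets. The sketch below assumes the usual multiset extension of $<$ on natural numbers, which the paper does not spell out; for a total order this extension coincides with comparing the elements sorted in decreasing order lexicographically. The function name is hypothetical.

```python
def multiset_less(m, n):
    """True iff the multiset m is smaller than n in the multiset
    extension of < on natural numbers (multisets given as lists)."""
    return sorted(m, reverse=True) < sorted(n, reverse=True)

# A Mutate step replaces one equation whose proof has m = p + q + 1 steps
# by equations whose proofs have p, q_1, ..., q_n steps, where
# 1 + q_1 + ... + q_n = q. For instance, with p = 3 and q_1 = q_2 = 1:
before = [7, 2]          # the selected equation (7) and another one (2)
after = [3, 1, 1, 2]     # p, q_1, q_2 and the untouched equation
decreases = multiset_less(after, before)
```

Replacing one element by any finite number of strictly smaller ones always decreases the multiset, which is the property the induction relies on.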

Now, finally, the completeness theorem: 

Theorem 3. Suppose that $E$ is an equational theory, $G$ is a set of goal equa-
tions, and $\theta$ is a ground substitution. If $E \models G\theta$ then there exists a goal $H$ in
normal form such that $G \xrightarrow{*} H$ and $\theta_H \leq_E \theta[Var(G)]$.

Proof. Let $G$ be a set of goal equations, and $\theta$ a ground substitution such that
$E \models G\theta$. Let $\mu(G, \theta) = M$. We will prove by induction on $M$ that there exists a
goal $H$ such that $G \xrightarrow{*} H$ and $\theta_H \leq_E \theta[Var(G)]$.

If nothing is selected in $G$, then $G$ must be in normal form, by Proposition
1. By Lemma 1, $\theta_G$ is the most general E-unifier of $G$, so $\theta_G \leq_E \theta[Var(G)]$.

If some equation is selected in $G$, we will prove that there is a goal $G'$ and a
substitution $\theta'$ such that $G \rightarrow G'$, $\theta' \leq_E \theta[Var(G)]$, and $\mu(G', \theta') < \mu(G, \theta)$.

So assume that some equation $u \approx v$ is selected in $G$. Then $G$ is of the form
$\{u \approx v\} \cup G_1$. We assume that $v$ is not a variable, because any term-variable
equation $t \approx x$ is immediately reoriented to $x \approx t$. By Lemma 3,
$|v\theta \approx u\theta|_E = |u\theta \approx v\theta|_E$. Also, according to our selection rule, a variable-variable
equation is never selected. Since $v$ is not a variable, it is in the form
$f(v_1, \ldots, v_n)$. Let $|u\theta \approx v\theta|_E = m$.

Consider the rule used at the root of the smallest proof tree that $E \vdash u\theta \approx v\theta$.
This was either an application of Congruence or Equation Application.

Case 1: Suppose the rule at the root of the proof tree of $E \vdash u\theta \approx v\theta$ is an
Equation Application. Then there exists an extension $\theta'$ of $\theta$ and a ground
instance $s\theta' \approx t\theta'$ of an equation $s \approx t$ in $E$, such that $E \vdash u\theta' \approx s\theta'$ and
$E \vdash t\theta' \approx v\theta'$. Let $|u\theta' \approx s\theta'|_E = p$. Let $|t\theta' \approx v\theta'|_E = q$. Then $m = p + q + 1$.
We now consider two subcases, depending on whether or not $t$ is a variable.

Case 1A: Suppose that $t$ is not a variable. Then, we can assume that the rule
at the root of the proof tree of $E \vdash t\theta' \approx v\theta'$ is Congruence. Otherwise,
by Lemma 5 it could be converted into one, without making the proof any
longer. So then $t$ is of the form $f(t_1, \ldots, t_n)$, and the previous nodes of the
proof tree are labeled with $t_1\theta' \approx v_1\theta', \ldots, t_n\theta' \approx v_n\theta'$. And, for each $i$,
$|t_i\theta' \approx v_i\theta'|_E = q_i$ such that $1 + \sum_{1 \leq i \leq n} q_i = q$.

Therefore, there is an application of Mutate that can be applied to $u \approx v$,
resulting in the new goal $G' = \{u \approx s, t_1 \approx v_1, \ldots, t_n \approx v_n\} \cup G_1$. Then
$|u\theta' \approx s\theta'|_E = p$, and $|t_i\theta' \approx v_i\theta'|_E = q_i$ for all $i$, so $\mu(G', \theta') < \mu(G, \theta)$. By
the induction assumption there is an $H$ such that $G' \xrightarrow{*} H$ with
$\theta_H \leq_E \theta'[Var(G')]$. This implies that $G \xrightarrow{*} H$. Also, $\theta_H \leq_E \theta'[Var(G)]$, since the
variables of $G$ are a subset of the variables of $G'$. Since $G\theta' = G\theta$, we know
that $\theta_H \leq_E \theta[Var(G)]$.

Case 1B: Suppose that $t$ is a variable. Then, by Lemma 5 we can assume that
the rule at the root of the proof tree of $E \vdash t\theta' \approx v\theta'$ is Congruence. So then
$t\theta'$ is of the form $f(t_1, \ldots, t_n)$, and the previous nodes of the proof tree are
labeled with $t_1 \approx v_1\theta', \ldots, t_n \approx v_n\theta'$. And, for each $i$, $|t_i \approx v_i\theta'|_E = q_i$ such
that $1 + \sum_{1 \leq i \leq n} q_i = q$.

Therefore, there is an application of Variable Mutate that can be applied
to $u \approx v$, resulting in the new goal $G' = \{u \approx s[t \mapsto f(x_1, \ldots, x_n)], x_1 \approx
v_1, \ldots, x_n \approx v_n\} \cup G_1$. We will extend $\theta'$ so that $x_i\theta' = t_i$ for all $i$. Then
$|u\theta' \approx s\theta'|_E = p$, and $|x_i\theta' \approx v_i\theta'|_E = q_i$ for all $i$, so $\mu(G', \theta') < \mu(G, \theta)$. By
the induction assumption there is an $H$ such that $G' \xrightarrow{*} H$ with
$\theta_H \leq_E \theta'[Var(G')]$. This implies that $G \xrightarrow{*} H$. Also, $\theta_H \leq_E \theta'[Var(G)]$, since the
variables of $G$ are a subset of the variables of $G'$. Since $G\theta' = G\theta$, we know
that $\theta_H \leq_E \theta[Var(G)]$.

Case 2: Now suppose that the rule at the root of the proof tree of $E \vdash u\theta \approx v\theta$
is an application of Congruence. There are two cases here: $u$ is a variable or
$u$ is not a variable.

Case 2A: First we will consider the case where $u$ is not a variable. Then
$u = f(u_1, \ldots, u_n)$, $v = f(v_1, \ldots, v_n)$ and $E \vdash u_i\theta \approx v_i\theta$ for all $i$. There is an
application of Decomposition that can be applied to $u \approx v$, resulting in the
new goal $G' = \{u_1 \approx v_1, \ldots, u_n \approx v_n\} \cup G_1$. Then $|u_i\theta \approx v_i\theta|_E < |u\theta \approx v\theta|_E$
for all $i$, so $\mu(G', \theta) < \mu(G, \theta)$. By the induction assumption there is an $H$
such that $G' \xrightarrow{*} H$ with $\theta_H \leq_E \theta[Var(G')]$. This implies that $G \xrightarrow{*} H$ and
$\theta_H \leq_E \theta[Var(G)]$.

Case 2B: Now we consider the final case, where $u$ is a variable and the rule at
the root of the proof tree of $E \vdash u\theta \approx v\theta$ is an application of Congruence.
Let $u\theta = f(u_1, \ldots, u_n)$. Then, for each $i$, $E \vdash u_i \approx v_i\theta$, and
$|u_i \approx v_i\theta|_E < |u\theta \approx v\theta|_E$. There is an application of Variable Decomposition that
can be applied to $u \approx v$, resulting in the new goal $G' = \{u \approx f(x_1, \ldots, x_n)\} \cup
(\{x_1 \approx v_1, \ldots, x_n \approx v_n\} \cup G_1)[u \mapsto f(x_1, \ldots, x_n)]$. Let $\theta'$ be the substitution
$\theta \cup [x_1 \mapsto u_1, \ldots, x_n \mapsto u_n]$. Then $u \approx f(x_1, \ldots, x_n)$ is solved in $G'$. Also
$|x_i\theta' \approx v_i\theta'|_E < |u\theta \approx v\theta|_E$ for all $i$. Therefore $\mu(G', \theta') < \mu(G, \theta)$. By
the induction assumption there is an $H$ such that $G' \xrightarrow{*} H$ with
$\theta_H \leq_E \theta'[Var(G')]$. This implies that $G \xrightarrow{*} H$. Also, $\theta_H \leq_E \theta'[Var(G)]$, since the
variables of $G$ are a subset of the variables of $G'$. Since $G\theta' = G\theta$, we know
that $\theta_H \leq_E \theta[Var(G)]$.

□ 

The fact that we required $\theta$ to be ground in the theorem does not limit our
results. The theorem implies that any substitution will work.

Corollary 1. Suppose that $E$ is an equational theory, $G$ is a set of goal equa-
tions, and $\theta$ is any substitution. If $E \models G\theta$ then there exists a goal $H$ such that
$G \xrightarrow{*} H$ and $\theta_H \leq_E \theta[Var(G)]$.

Proof. Let $\theta'$ be a skolemized version of $\theta$, i.e., $\theta'$ is the same as $\theta$ except that
every variable in the range of $\theta$ is replaced by a new constant. Then $\theta'$ is ground,
so by Theorem 3 there exists a goal $H$ such that $G \xrightarrow{*} H$ and $\theta_H \leq_E \theta'[Var(G)]$.
Then $\theta_H$ cannot contain any of the new constants, so $\theta_H \leq_E \theta[Var(G)]$. □
9 E-Unification for Syntactic Theories

In this section we will show how we can restrict our inference rules further to get 
a set of inference rules that resembles the Syntactic Mutation rules of Kirchner. 
Then we prove that that set of inference rules is complete for syntactic theories. 

The definition of a syntactic theory is in terms of equational proofs. The 
definition of a proof is as follows. 

Definition 2. An equational proof of $u \approx v$ from $E$ is a sequence
$u_0 \approx u_1 \approx u_2 \approx \cdots \approx u_n$, for $n \geq 0$ such that $u_0 = u$, $u_n = v$ and for all $i \geq 0$,
$u_i = u_i[s\theta]$ and $u_{i+1} = u_i[t\theta]$ for some $s \approx t \in E$ and some substitution $\theta$.

Now we give Kirchner’s definition of syntactic theory. 

Definition 3. An equational theory $E$ is resolvent if every equation $u \approx v$ with
$E \models u \approx v$ has an equational proof such that there is at most one step at the
root. A theory is syntactic if it has an equivalent finite resolvent presentation.

From now on, when we discuss a Syntactic Theory E, we will assume that E 
is the resolvent presentation of that theory. 

In this paper, we are considering bottom-up proofs instead of equational 
replacement proofs. We will call a bottom-up proof resolvent if whenever an 
equation appears as a result of Equation Application, then its left and right 
children must have appeared as a result of an application of Congruence at 
the root. We will call $E$ bottom-up resolvent if every ground equation $u \approx v$
implied by E has a bottom-up resolvent proof. Now we show that the definition 
of resolvent for equational proofs is equivalent to the definition of resolvent for 
bottom-up proofs. 

Theorem 4. E is resolvent if and only if E is bottom-up resolvent. 

Proof. We need to show how to transform a resolvent equational proof into a 
resolvent bottom-up proof and vice versa. 

Case 1: First consider transforming a resolvent equational proof into a resolvent
bottom-up proof. We will prove this can be done by induction on the
lexicographic combination of the number of steps in the equational proof
and the number of symbols appearing in the equation.

Case 1A: Suppose $u \approx v$ has an equational proof with no steps at the root.
Then $u \approx v$ is of the form $f(u_1, \ldots, u_n) \approx f(v_1, \ldots, v_n)$, and there are
equational proofs of $u_i \approx v_i$ for all $i$. Since each equation $u_i \approx v_i$ has fewer
symbols than $u \approx v$ and does not have a longer proof, then, by the induction
argument there is a resolvent bottom-up proof of each $u_i \approx v_i$, and by adding
one more congruence step to all the $u_i \approx v_i$, we get a resolvent bottom-up
proof of $u \approx v$.

Case 1B: Now suppose $u \approx v$ has an equational proof with one step at the
root. Then there is a ground instance $s \approx t$ of something in $E$ such that the
proof of $u \approx v$ is a proof of $u \approx s$ with no steps at the top, followed by a
replacement of $s$ with $t$, followed by a proof of $t \approx v$ with no steps at the root.
By induction, each child in the proof of $u \approx s$ has a resolvent bottom-up
proof. Therefore $u \approx s$ has a resolvent bottom-up proof with a Congruence
step at the root. Similarly, $t \approx v$ has a resolvent bottom-up proof with a
Congruence step at the root. If we apply Equation Application to those two
proofs, we get a bottom-up resolvent proof of $u \approx v$.

Case 2: Now we will transform a resolvent bottom-up proof of $u \approx v$ to an
equational proof of $u \approx v$, by induction on $|u \approx v|_E$.

Case 2A: Suppose $u \approx v$ has a bottom-up resolvent proof with an application
of Congruence at the root. Then $u \approx v$ is of the form $f(u_1, \ldots, u_n) \approx
f(v_1, \ldots, v_n)$ and there are bottom-up resolvent proofs of $u_i \approx v_i$ for all
$i$. Since each equational proof of $u_i \approx v_i$ is shorter than the proof of $u \approx v$,
then, by the induction argument there is a resolvent equational proof of each
$u_i \approx v_i$, and they can be combined to give a resolvent equational proof of
$u \approx v$.

Case 2B: Now suppose $u \approx v$ has a resolvent bottom-up proof with one Equa-
tion Application step at the root. Then there is some $s \approx t$ in $E$ such that
the proof of $u \approx v$ is a proof of $u \approx s$ with a Congruence step at the root,
and a proof of $t \approx v$ with a Congruence step at the root, then an Equation
Application using the equation $s \approx t$ from $E$. By induction, the correspond-
ing equalities of subterms of $u \approx s$ have resolvent equational proofs. So $u \approx s$
has a resolvent equational proof with no steps at the root. Similarly, $t \approx v$
also has a resolvent equational proof with no steps at the root. So $u \approx v$ has
a resolvent equational proof with one step at the root.

□ 

Now we give the inference rules for solving E-unification problems in Syntac- 
tic Theories. The rules for Decomposition and Variable Decomposition remain 
the same, but Mutate becomes more restrictive. We replace Mutate and Variable 
Mutate with one rule that covers several cases. 



Mutate

\[
\frac{\{u \approx v\} \cup G}
     {\{Dec(u \approx s),\ Dec(v \approx t)\} \cup G}
\]

where $u \approx v$ is selected in the goal, $s \approx t \in E$, $v$ is not a variable, if both $u$
and $s$ are not variables then they have the same root symbol, and if $t$ is not a
variable then $v$ and $t$ have the same root symbol. We also introduce a function
$Dec$, which when applied to an equation indicates that the equation should be
decomposed eagerly according to the following rules:

\[
\frac{\{Dec(f(u_1, \ldots, u_n) \approx f(s_1, \ldots, s_n))\} \cup G}
     {\{u_1 \approx s_1, \ldots, u_n \approx s_n\} \cup G}
\]

\[
\frac{\{Dec(x \approx f(s_1, \ldots, s_n))\} \cup G}
     {\{x \approx f(x_1, \ldots, x_n)\} \cup G[x \mapsto f(x_1, \ldots, x_n)] \cup \{x_1 \approx s_1, \ldots, x_n \approx s_n\}}
\]

where the $x_i$ are fresh variables.

\[
\frac{\{Dec(x \approx y)\} \cup G}{\{x \approx y\} \cup G}
\]

\[
\frac{\{Dec(f(s_1, \ldots, s_n) \approx x)\} \cup G}
     {G[x \mapsto f(x_1, \ldots, x_n)] \cup \{x_1 \approx s_1, \ldots, x_n \approx s_n\}}
\]

where the $x_i$ are fresh variables.

Now we prove a completeness theorem for this new set of inference rules, 
which is Decomposition, Variable Decomposition, and the Mutate rule given 
above. 

Theorem 5. Suppose that $E$ is a resolvent presentation of an equational theory,
$G$ is a set of goal equations, and $\theta$ is a ground substitution. If $E \models G\theta$ then there
exists a goal $H$ in normal form such that $G \xrightarrow{*} H$ and $\theta_H \leq_E \theta[Var(G)]$.

Proof. The proof is the same as the proof of Theorem 3 except for Case 1.
In this case, we can show that one of the forms of the Mutate rules from this
section is applicable. Here, instead of using Lemma 5 to say that an Equation
Application must have a Congruence as a right child, we instead use the definition
of bottom-up resolvent to say that an Equation Application has a Congruence
as both children. The full proof is in [?]. □

10 Conclusion 

We have given a new goal-directed inference system for E-unification. We are
interested in goal-directed E-unification for two reasons. One is that many other
inference systems for which E-unification would be useful are goal directed, and
so a goal-directed inference system will be easier to combine with other inference
systems. The second reason is that we believe this particular inference system is
such that we can use it to find some decidable classes of equational theories for
E-unification and analyze their complexity. We have already made progress in
this direction in [?].

Our inference system is an improvement over the inference systems BT of [?]
and Trans of [?] for Equational Unification. There are two important differences
between our inference system and those other two. The first is that those inference
systems require the Variable Elimination rule. This blows up the search space,
because, for an equation $x \approx t$, both Variable Elimination and (Root) Imitation
will be applicable. We do not require Variable Elimination. The second difference
is that both of those inference systems require an inference with a variable in
the goal. In BT, Root Rewriting inferences are performed on variable-variable
pairs. This blows up the search space, because everything unifies with a variable.
Similarly, in BT, Root Imitation inferences are performed on variable-variable
pairs. That blows up the search space because it must be attempted for every
function symbol and constant. In Trans, there is a rule called Paramodulation
at Variable Occurrence. This is like a Mutate (Lazy Paramodulation) inference
applied to a variable $x$ in a goal equation $x \approx t$. Again, every equation will
unify with $x$, so the search space will blow up. Gallier and Snyder recognize the
above-mentioned problems of BT. Their solution is to create another inference
system called T, but that one is different because Root Rewriting inferences are
now allowed at non-root positions.

The inference system we have given is similar to the Syntactic Mutation
inference system of [?]. The difference is that our inference system can be applied
to all equational theories, not just Syntactic Theories as in their case. Also,
we show how our results are easily adapted to give an inference similar to the
Syntactic Mutation rules of [?]. While the rules in [?] have not been proved
complete, we prove that ours are complete.
Abstract. The unification problem for term rewriting systems (TRSs)
is the problem of deciding, for a given TRS $R$ and two terms $M$ and $N$,
whether there exists a substitution $\theta$ such that $M\theta$ and $N\theta$ are congruent
modulo $R$ (i.e., $M\theta \leftrightarrow_R^* N\theta$). In this paper, the unification problem for
confluent right-ground TRSs is shown to be decidable. To show this, the
notion of minimal terms is introduced and a new unification algorithm
of obtaining a substitution whose range is in minimal terms is proposed.
Our result extends the decidability of unification for canonical (i.e., con-
fluent and terminating) right-ground TRSs given by Hullot (1980) in
the sense that the termination condition can be omitted. It is also ex-
emplified that Hullot's narrowing technique does not work in this case.
Our result is compared with the undecidability of the word (and also
unification) problem for terminating right-ground TRSs.



1 Introduction 

The unification problem for TRSs is the problem of deciding, for a TRS $R$ and
two terms $M$ and $N$, whether $M$ and $N$ are unifiable modulo $R$, that is, whether
there exists a substitution $\theta$ (called an R-unifier) such that $M\theta$ and $N\theta$ are con-
gruent modulo $R$ (i.e., $M\theta \leftrightarrow_R^* N\theta$). The unification problem is undecidable in
general and even if we restrict to subclasses of TRSs, a lot of negative results have
been shown, e.g., undecidability for canonical (i.e., terminating and confluent)
TRSs [3] (having the decidable word problem), terminating and right-ground
TRSs (since the word problem for this class is undecidable [10] and the word
problem, $M \leftrightarrow_R^* N$, is a special case of the unification problem). On the other
hand, several positive results have been obtained, e.g., unification is decidable
for ground TRSs [2], left-linear and right-ground TRSs [8,2], canonical right-
ground TRSs [4], shallow TRSs [1], linear standard TRSs [8] and semi-linear
TRSs [5]. The narrowing (or paramodulation) technique [4,8] is strong and use-
ful for showing the decidability of unification, and in fact has been used for
obtaining many of the above decidability results. But this technique is difficult
to apply to nonterminating TRSs. Thus, new techniques ensuring the decidability
of unification for nonterminating TRSs are needed. This motivates the
investigation in this paper.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 246-260, 2001. 
In this paper, we consider the unification problem for confluent right-ground
TRSs which may be nonterminating. This is a natural problem: to extend
the decidability of unification for canonical right-ground TRSs,
we have two choices. One is to omit the termination condition; the other is
to omit the confluence or right-ground condition, but the latter is impossible
by the undecidability results for terminating right-ground TRSs and canonical
TRSs. In this paper, we show that the termination condition can be omitted, i.e.,
unification is decidable for confluent right-ground TRSs. This result can also be
regarded as a solution to one of the open problems posed by Nieuwenhuis [8].

We first see that the narrowing technique does not work in this case. Let
$R_1 = \{eq(x,x) \to t,\ eq(not(x),x) \to f,\ t \to not(f),\ f \to not(t)\}$ where only x is
a variable. Note that $R_1$ is nonterminating, confluent [11] and right-ground. Let
$M = eq(y, not(y))$ and $N = y$ where y is a variable. In this case, the narrowing
technique does not work: since no nonvariable subterm of M (or N) is
$\emptyset$-unifiable with the left-hand side of any rule, this technique cannot decide whether
M and N are $R_1$-unifiable. (Note that a substitution $\theta$ satisfying $y\theta = f$ is an
$R_1$-unifier of M and N.)
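
The parenthetical claim can be checked mechanically. The following Python sketch is only an illustration and not part of the algorithm of this paper: the tuple encoding of terms and the helper names `match`, `successors` and `reaches` are assumptions introduced here. It runs a bounded breadth-first search over one-step rewriting with $R_1$ and confirms that $eq(f, not(f))$ rewrites to $f$. A step bound is needed because $R_1$ is nonterminating.

```python
from collections import deque

# Assumed encoding: a variable is a plain string; an application is a
# tuple (symbol, arg_1, ..., arg_k); a constant is a 1-tuple.
T, F = ("t",), ("f",)

def NOT(a):
    return ("not", a)

def EQ(a, b):
    return ("eq", a, b)

# R_1 from the text; "x" is the only variable.
R1 = [
    (EQ("x", "x"), T),
    (EQ(NOT("x"), "x"), F),
    (T, NOT(F)),
    (F, NOT(T)),
]

def match(pattern, term, binding):
    """Syntactic matching of pattern against term; returns a binding or None."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, s in zip(pattern[1:], term[1:]):
        binding = match(p, s, binding)
        if binding is None:
            return None
    return binding

def successors(term, rules):
    """All terms obtained from term by one rewrite step at any position."""
    out = []
    for lhs, rhs in rules:
        if match(lhs, term, {}) is not None:
            out.append(rhs)  # right-hand sides are ground: nothing to instantiate
    if not isinstance(term, str):
        for i in range(1, len(term)):
            for s in successors(term[i], rules):
                out.append(term[:i] + (s,) + term[i + 1:])
    return out

def reaches(start, goal, rules, max_steps):
    """Bounded breadth-first search for start ->* goal within max_steps steps."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        term, depth = frontier.popleft()
        if term == goal:
            return True
        if depth < max_steps:
            for nxt in successors(term, rules):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return False
```

With $y\theta = f$, the instance of M is `EQ(F, NOT(F))` and the instance of N is `F`, so `reaches(EQ(F, NOT(F)), F, R1, 3)` witnesses the unifier. The search only certifies reachability within the bound; it is not a decision procedure.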

So, we use a more general technique analogous to lazy narrowing [6,7] and
RU [3, p. 284], each of which consists of more primitive operations which can
simulate narrowing. The most crucial point, however, is to transform such a
technique into a decision procedure which can decide whether a problem instance is
unifiable or not. To our knowledge, very few such attempts have been made so far. To
obtain our result, we introduce the notion of minimal terms for nonterminating
right-ground TRSs, which play a role similar to that of irreducible (i.e., normal form)
terms in terminating TRSs. Then, we construct a unification algorithm which
takes as input a confluent right-ground TRS R and two terms M and N and
produces an R-unifier $\theta$ of M and N such that $x\theta$ is minimal for each variable
x iff M and N are unifiable modulo R. Such a $\theta$, whose range consists of minimal
terms, is called locally minimal; this notion is a key idea for ensuring the
correctness of our algorithm.

2 Preliminaries 

We assume that the reader is familiar with standard definitions of rewrite systems 
(see [3]) and we just recall here the main definitions and notations used in the 
paper. 

We use $\varepsilon$ to denote the empty string and $\emptyset$ to denote the empty set. For a
set S, let $\mathrm{Power}(S) = \{S' \mid S' \subseteq S\}$, i.e., the set of all the subsets of S, and let
$|S|$ be the cardinality of S. Let X be a set of variables, let F be a finite set of
operation symbols graded by an arity function $\mathrm{arity} : F \to \{0,1,2,\ldots\}$, and let
T be the set of terms constructed from X and F. A term M is ground if M has
no variable. Let G be the set of ground terms. For a term M, we use $O(M)$ to
denote the set of positions of M, $M|_u$ to denote the subterm of M at position u,
and $M[N]_u$ to denote the term obtained from M by replacing the subterm $M|_u$
by term N. For a sequence $(u_1, \ldots, u_n)$ of pairwise disjoint positions and terms
$L_{u_1}, \ldots, L_{u_n}$, we use $M[L_{u_1}, \ldots, L_{u_n}]_{(u_1, \ldots, u_n)}$ to denote the term obtained from




248 M. Oyamaguchi and Y. Ohta 



M by replacing each subterm $M|_{u_i}$ by $L_{u_i}$ $(1 \le i \le n)$. Let $O_x(M)$ be the set
of positions of variable $x \in X$ in M, i.e., $O_x(M) = \{u \in O(M) \mid M|_u = x\}$.
Let $O_X(M) = \bigcup_{x \in X} O_x(M)$ and $O_F(M) = O(M) \setminus O_X(M)$. Let $V(M)$ be
the set of variables occurring in M. We use $|M|$ to denote the size of M, i.e.,
the number of symbols in M. For a position u, we use $|u|$ to denote the length
of u. The root symbol of M is denoted by root(M). $M|_u$ is a leaf symbol of
M if $|M|_u| = 1$. Let $M[N/x]$ be the term obtained from M by replacing all
occurrences of x by N. This notation is extended to sets of terms: for $\Gamma \subseteq T$, let
$\Gamma[N/x] = \{M[N/x] \mid M \in \Gamma\}$. Let $\overline{O}_X(M) = \{u \in O(M) \mid \forall v \in O_X(M).\ v \mid u\}$.

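Let $\gamma : M_1 \leftrightarrow^{u_1} M_2 \cdots \leftrightarrow^{u_{n-1}} M_n$ or $M_1 \to^{u_1} M_2 \cdots \to^{u_{n-1}} M_n$ be a rewrite
sequence. Then, $\mathcal{R}(\gamma) = \{u_1, \ldots, u_{n-1}\}$, i.e., the set of the redex positions of
$\gamma$. If $\varepsilon \notin \mathcal{R}(\gamma)$, then $\gamma$ is called $\varepsilon$-invariant (or $\varepsilon$-inv). A position u in a set of
positions U is minimal if $v \not< u$ for any $v \in U$. Let Min(U) be the set of minimal
positions of U.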
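
The position notation above can be made concrete with a small Python sketch. It is an illustration under assumptions introduced here, not part of the paper: terms are encoded as nested tuples `(symbol, arg_1, ..., arg_k)` with variables as plain strings, positions are tuples of 1-based argument indices (the empty tuple plays the role of $\varepsilon$), and the function names are hypothetical.

```python
def positions(term):
    """O(M): all positions of term, in preorder."""
    if isinstance(term, str):
        return [()]
    out = [()]
    for i in range(1, len(term)):
        out.extend((i,) + p for p in positions(term[i]))
    return out

def subterm(term, pos):
    """M|_u: the subterm of term at position pos."""
    for i in pos:
        term = term[i]
    return term

def replace(term, pos, new):
    """M[N]_u: term with the subterm at position pos replaced by new."""
    if not pos:
        return new
    i = pos[0]
    return term[:i] + (replace(term[i], pos[1:], new),) + term[i + 1:]

def is_prefix(u, v):
    """u <= v in the prefix ordering on positions."""
    return v[:len(u)] == u

def minimal_positions(U):
    """Min(U): positions of U with no proper prefix in U."""
    return {u for u in U if not any(v != u and is_prefix(v, u) for v in U)}
```

For $M = eq(y, not(y))$, encoded as `("eq", "y", ("not", "y"))`, the variable y occurs at positions 1 and 21, which in this encoding are `(1,)` and `(2, 1)`.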

Definition 1. We use $M \approx N$ to denote a pair of terms M and N. $M \approx N$ is
unifiable modulo a TRS R (or simply R-unifiable) if there exists a substitution
$\theta$ and a rewrite sequence $\gamma$ such that $\gamma : M\theta \leftrightarrow^* N\theta$. Such $\theta$ and $\gamma$ are called
an R-unifier and a proof of $M \approx N$, respectively. This notion is extended to
sets of term pairs: for $\Gamma \subseteq T \times T$, $\theta$ is an R-unifier of $\Gamma$ if $\theta$ is an R-unifier
of every pair $M_i \approx N_i$ of $\Gamma$. In this case, $\Gamma$ is R-unifiable. As a special case
of R-unifiability, $M \approx N$ is $\emptyset$-unifiable if there exists a substitution $\theta$ such that
$M\theta = N\theta$, i.e., $\emptyset$-unifiability coincides with usual unifiability.



2.1 Standard Right-Ground TRS and Minimal Term 

Definition 2. A right-ground TRS R is said to be standard if $|\alpha| = 1$ or $|\beta| = 1$
for any rule $\alpha \to \beta \in R$.

Let $R = \{\alpha_1 \to \beta_1, \ldots, \alpha_n \to \beta_n\}$ be a right-ground TRS. The corresponding
standard TRS $R'$ is constructed as follows. Let $c_1, \ldots, c_n \in F$ be new pairwise
distinct constants which do not appear in R. Then,
$R' = \{\alpha_i \to c_i,\ c_i \to \beta_i \mid 1 \le i \le n\}$ is standard.

We can show that R is confluent iff $R'$ is confluent, and for any terms M, N
which do not contain $c_1, \ldots, c_n$ and any substitution $\theta$, $M\theta \leftrightarrow_R^* N\theta$ iff
$M\theta \leftrightarrow_{R'}^* N\theta$. The proof is straightforward and is omitted. Thus, the R-unification problem
for a confluent right-ground TRS R reduces to that for the corresponding
standard $R'$. Hence, without loss of generality we can assume that a confluent
right-ground TRS R is standard. Henceforth, we consider a fixed right-ground
TRS R which is confluent and standard.
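
The construction of $R'$ from R is mechanical, as the following Python sketch suggests. The encoding (rules as pairs of nested tuples, variables as strings) and the names `standardize`, `is_standard` and the constant prefix `"c"` are assumptions made for this illustration; in particular, the caller is assumed to guarantee that the generated constants do not already occur in R.

```python
def size(term):
    """|M|: the number of symbols in a term."""
    if isinstance(term, str):
        return 1
    return 1 + sum(size(arg) for arg in term[1:])

def is_standard(rules):
    """Definition 2: every rule has a left- or right-hand side of size 1."""
    return all(size(lhs) == 1 or size(rhs) == 1 for lhs, rhs in rules)

def standardize(rules, prefix="c"):
    """Split each rule alpha_i -> beta_i into alpha_i -> c_i and c_i -> beta_i.

    The constants prefix1, prefix2, ... are assumed to be fresh.
    """
    out = []
    for i, (lhs, rhs) in enumerate(rules, start=1):
        c = (f"{prefix}{i}",)
        out.append((lhs, c))
        out.append((c, rhs))
    return out
```

For instance, the single hypothetical rule $g(a) \to h(b)$ is not standard, while its image $\{g(a) \to c_1,\ c_1 \to h(b)\}$ is. Note that $R_1$ from Section 1 is already standard, so no splitting is needed there.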

Definition 3. Let $H(M) = \mathrm{Max}\{|u| \mid u \in O(M)\}$, i.e., the height of M. We
define $H_m(M)$ as $\{H(M|_u) \mid u \in O(M)\}_m$. Here, we use $\{\cdots\}_m$ to denote a
multiset and below we use $\sqcup$ to denote the multiset union. Let $\ll$ be the multiset
extension of $<$ and let $\underline{\ll}$ be $\ll \cup =$. For a term M, let $L(M) = \{N \mid N \leftrightarrow^* M\}$
and $L_{min}(M) = \{N \in L(M) \mid \forall N' \in L(M).\ H_m(N) \underline{\ll} H_m(N')\}$. A term M is
minimal iff $M \in L_{min}(M)$.
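
To make the measure $H_m$ concrete, here is a hedged Python sketch that computes the height multiset of a term and compares two such multisets in the multiset extension of $<$. The term encoding (nested tuples, variables as strings) and the function names are assumptions of this illustration. Because the natural numbers are totally ordered, the comparison can look at the largest element whose multiplicities differ; this shortcut is a standard fact about multiset extensions of total orders rather than something stated in the paper.

```python
from collections import Counter

def height(term):
    """H(M): the length of the longest position in term."""
    if isinstance(term, str) or len(term) == 1:
        return 0
    return 1 + max(height(arg) for arg in term[1:])

def height_multiset(term):
    """H_m(M): the multiset of heights of all subterms of term."""
    ms = Counter([height(term)])
    if not isinstance(term, str):
        for arg in term[1:]:
            ms += height_multiset(arg)
    return ms

def multiset_less(a, b):
    """a << b in the multiset extension of < on natural numbers."""
    if a == b:
        return False
    # For a total order, the largest element with differing
    # multiplicity decides the comparison.
    top = max(k for k in set(a) | set(b) if a[k] != b[k])
    return a[top] < b[top]
```

For $R_1$ from Section 1, $H_m(f) = \{0\}_m$ is smaller than $H_m(eq(f, not(f))) = \{2, 1, 0, 0\}_m$. Since $eq(f, not(f)) \leftrightarrow^* f$, the term $eq(f, not(f))$ is therefore not minimal.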
Note that $L_{min}(M)$ is well-defined: $L_{min}(M) \ne \emptyset$. We have the following
lemmata.

Lemma 1. Let M be a minimal term and let $\gamma : M \leftrightarrow^* N$. Then for any
$u \in \mathcal{R}(\gamma)$ there exists $v \in O(M)$ such that $H(M|_v) = 0$ and $v \le u$. (That is,
only leaf symbols of M are rewritten in $\gamma$.)

Proof. Note that since R is standard, $L \leftrightarrow^{\varepsilon} L'$ implies that $H(L) = 0$ or
$H(L') = 0$ holds. Thus, minimality of M ensures this property, since there exists no
$\delta : M|_v \leftrightarrow^* L$ satisfying $H(M|_v) > 0$ and $H(L) = 0$. □

Lemma 2. For any term M, $L_{min}(M)$ is finite and computable.

Proof. We prove this lemma by induction on $H(M)$. First suppose that $H(M) = 0$.
Obviously, $M \in L_{min}(M)$. For a term $N \ne M$, if $N \in L_{min}(M)$, then
$H(N) = 0$ and $N \leftrightarrow^+ M$. Since R is right-ground and confluent, we have
$N \downarrow M$ and $N, M \in F$. Since joinability for right-ground TRSs is decidable [9]
and F is finite, $L_{min}(M)$ is finite and computable.

Next suppose that $H(M) > 0$. We first check whether there exists a rule
$\alpha \to \beta \in R$ such that if $|\alpha| = 1$, then $M \leftrightarrow^* \alpha$, otherwise $M \leftrightarrow^* \beta$. This is also
decidable by arguments similar to those above. If so, then $\alpha \in L_{min}(M)$ and $|\alpha| = 1$,
or $\beta \in L_{min}(M)$ and $|\beta| = 1$, since R is standard. Thus, $L_{min}(M) = L_{min}(\alpha)$
or $L_{min}(M) = L_{min}(\beta)$. It follows that $L_{min}(M)$ is finite and computable.
Otherwise, $M \leftrightarrow^* N$ implies that $M \leftrightarrow^* N$ is $\varepsilon$-invariant for any term N, i.e.,
root(M) = root(N). Let f = root(M) and let k = arity(f). Since $L_{min}(M|_i)$ is
finite and computable for all $1 \le i \le k$ according to the induction hypothesis,
so is $L_{min}(M) = \{f(N_1, \ldots, N_k) \mid N_i \in L_{min}(M|_i) \text{ for } 1 \le i \le k\}$. □

2.2 Locally Minimal Unifier and New Pair of Terms 

Definition 4. Let $\Gamma \subseteq T \times T$. A substitution $\theta$ is a locally minimal R-unifier
of $\Gamma$ if $\theta$ is an R-unifier of $\Gamma$ and $x\theta$ is minimal for any $x \in \mathrm{Dom}(\theta)$.

In this paper, we give a new unification algorithm which takes a pair of terms
$M \approx N$ as input and produces a locally minimal unifier $\theta$ of $M \approx N$ iff $M \approx N$ is
R-unifiable. For this purpose, we need pairs of terms having new types. $M \triangleright_U N$
and $M \approx_{vf} N$, which are respectively called term pairs with type $\triangleright_U$ and with
type vf, are introduced where $M, N \in T$ and $U \subseteq O(N)$ is a set of pairwise
disjoint positions. Let $E_0 = \{M \approx N,\ M \triangleright_U N,\ M \approx_{vf} N,\ \mathrm{fail} \mid M, N \in T$ and
$U \subseteq O(N)$ is a set of pairwise disjoint positions$\}$. Here, fail is introduced as a
special symbol and we assume that there exists no R-unifier of fail. R-unifiers of
these new pairs are required to satisfy additional conditions derived from these
types:

Definition 5. A substitution $\theta$ is an R-unifier of $M \triangleright_U N$ if $\theta$ is an R-unifier of
$M \approx N$ and the following condition holds: if $U = \emptyset$ then $M\theta \to^* N\theta$, otherwise
there exists $L = N\theta[L_{u_1}, \ldots, L_{u_n}]_{(u_1, \ldots, u_n)}$ for some $L_{u_i}$ $(1 \le i \le n)$, where
$U = \{u_1, \ldots, u_n\}$, such that $M\theta \to^* L$ and for any $u_i \in U$, $L_{u_i} \leftrightarrow^* N\theta|_{u_i}$.

A substitution $\theta$ is an R-unifier of $M \approx_{vf} N$ if $\theta$ is an R-unifier of $M \approx N$
and there exists $\gamma : M\theta \leftrightarrow^* N\theta$ such that $O_X(N)$ is a frontier in $\gamma$, i.e., $u \mid v$ or
$v \le u$ holds for any $u \in \mathcal{R}(\gamma)$ and $v \in O_X(N)$.

Note that if $U = \{\varepsilon\}$, then $\theta$ is an R-unifier of $M \triangleright_{\{\varepsilon\}} N$ iff $\theta$ is an R-unifier
of $M \approx N$ by definition. So, $M \triangleright_{\{\varepsilon\}} N$ is replaced by $M \approx N$ and excluded
from $E_0$. Also, pairs of the form $M \triangleright_U x$, where $M \in T$, $x \in X$, $U \subseteq O(x)$, are not
used and are excluded from $E_0$.

Example 1. Let $R_1$ be the TRS shown in Section 1.

1. $eq(not(x),x) \approx not(x)$ is $R_1$-unifiable, since any substitution $\theta$ satisfying
$x\theta = t$ is an $R_1$-unifier: $eq(not(t),t) \to f \to not(t)$.

2. $eq(f, not(not(t))) \triangleright_{\{21\}} eq(y, not(y))$ is $R_1$-unifiable, since any substitution
$\theta$ satisfying $y\theta = f$ is an $R_1$-unifier: $eq(f, not(not(t))) \leftarrow^{21} eq(f, not(f))$.

3. $eq(t, not(t)) \approx_{vf} eq(not(f), y)$ is $R_1$-unifiable, since any substitution $\theta$
satisfying $y\theta = f$ is an $R_1$-unifier: $eq(t, not(t)) \to^{1} eq(not(f), not(t)) \leftarrow^{2}
eq(not(f), f)$.

3 i?-Unification Algorithm 

We are ready to give our R-unification algorithm $\Phi$. Our algorithm consists of a
set of primitive operations analogous to those of Lazy Narrowing [6,7] and [3].
Each primitive operation takes a finite set of pairs $\Gamma \subseteq E_0$ and produces some
$\Gamma' \subseteq E_0$, denoted by $\Gamma \Rightarrow_\Phi \Gamma'$. This operation is called a transformation. Such a
transformation is made nondeterministically: $\Gamma \Rightarrow_\Phi \Gamma_1,\ \Gamma \Rightarrow_\Phi \Gamma_2,\ \ldots,\ \Gamma \Rightarrow_\Phi \Gamma_k$
are allowed for some $\Gamma_1, \ldots, \Gamma_k \subseteq E_0$. Let $\Rightarrow_\Phi^*$ be the reflexive transitive closure
of $\Rightarrow_\Phi$. Our algorithm starts from $\Gamma_0 = \{M_0 \approx N_0\}$, where $M_0, N_0 \in T$, and
makes primitive transformations repeatedly. We will prove that there exists a
sequence $\Gamma_0 \Rightarrow_\Phi^* \Gamma$ such that $\Gamma$ is $\emptyset$-unifiable iff $\Gamma_0$ is R-unifiable.

Our algorithm is divided into three stages. 

3.1 Stage I 

The transformation $\Phi_1$ of Stage I takes as input a finite subset $\Gamma$ of $E_0$ and
has a finite number of nondeterministic choices $\Gamma \Rightarrow_{\Phi_1} \Gamma_1, \ldots, \Gamma \Rightarrow_{\Phi_1} \Gamma_k$ for
some $\Gamma_1, \ldots, \Gamma_k \subseteq E_0$, i.e., $\Phi_1$ is finite-branching. In this case, we write
$\Phi_1(\Gamma) = \{\Gamma_1, \ldots, \Gamma_k\}$ by regarding $\Phi_1$ as a function.

We begin with the initial $\Gamma = \{M_0 \approx N_0\}$ and repeatedly apply the
transformation $\Phi_1$ until the current $\Gamma$ satisfies the stop condition of Stage I defined
below. We consider all possibilities in order to ensure the correctness of the
algorithm. If $\Gamma$ satisfies this condition, then $\Gamma$ becomes an input of the next stage.
The stop condition of Stage I is as follows.

$\Gamma \cap \{\mathrm{fail},\ M \approx_{vf} N \mid M, N \in T\} \ne \emptyset$ or $\Gamma \subseteq X \times X$

To describe the transformations used in Stage I, we need the following
auxiliary function:

$\mathrm{decompose}(M, N, U) = \{M|_i \triangleright_{U/i} N|_i \mid 1 \le i \le k \text{ and } U/i \ne \{\varepsilon\}\}$
$\qquad \cup\ \{M|_i \approx N|_i \mid 1 \le i \le k \text{ and } U/i = \{\varepsilon\}\}$

where $k = \mathrm{arity}(\mathrm{root}(M))$ and $U/i = \{u \mid i \cdot u \in U\}$. (Note that $\emptyset/i = \emptyset$.)
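
As an illustration, the auxiliary function can be transcribed almost literally into Python. The sketch below rests on assumptions made here: terms are nested tuples with variables as strings, positions are tuples of 1-based argument indices, and the tags `"approx"` and `"tri"` are hypothetical names for pairs of type $\approx$ and $\triangleright_U$. It follows the displayed definition only and does not apply the later convention that excludes some pairs from $E_0$.

```python
def decompose(M, N, U):
    """decompose(M, N, U) for terms M, N with the same root arity.

    U is a set of positions. Each result is either
    ("approx", M|_i, N|_i) or ("tri", M|_i, N|_i, U/i).
    """
    k = len(M) - 1  # arity of root(M)
    out = []
    for i in range(1, k + 1):
        # U/i = { u | i.u in U }
        U_i = frozenset(u[1:] for u in U if u and u[0] == i)
        if U_i == {()}:  # U/i = {epsilon}
            out.append(("approx", M[i], N[i]))
        else:
            out.append(("tri", M[i], N[i], U_i))
    return out
```

For example, with $M = eq(not(x), t)$, $N = eq(not(x'), x')$ and $U = \{11, 2\}$, the call yields a pair of type $\triangleright_{\{1\}}$ for the first argument and a pair of type $\approx$ for the second.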




The Unification Problem 



251 



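In Stage I, we nondeterministically apply Conversion or choose an element p
in $\Gamma \setminus X \times X$ and apply one of the following transformations (TT, TL$_\triangleright$, GG,
VG, VT) to $\Gamma$ according to the type of the chosen p. That is, for $p = M \approx N$,

  M \ N                        | $T \setminus (G \cup X)$ | G  | X
  $T \setminus (G \cup X)$     | TT                       | TT | VT
  G                            | TT                       | GG | VG
  X                            | VT                       | VG | -

and for $p = M \triangleright_U N$,

  M \ N                        | $T \setminus (G \cup X)$ | G
  $T \setminus (G \cup X)$     | TL$_\triangleright$      | TL$_\triangleright$
  G                            | TL$_\triangleright$      | GG if $U = \emptyset$, TL$_\triangleright$ if $U \ne \emptyset$
  X                            | VT                       | VG

If no transformation is possible, $\Gamma \Rightarrow_{\Phi_1} \{\mathrm{fail}\}$.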

Let $\Gamma' = \Gamma \setminus \{p\}$. We write $p \simeq M \approx N$ if $p = M \approx N$ or $p = N \approx M$. To
help in understanding the transformations, we assume that $\theta$ is a locally
minimal unifier of p and we list the conditions that are assumed on a proof $\gamma$ of
$p\theta$. When applying the transformations we of course lack this information and
so we just have to check that the conditions of the transformations are satisfied.



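Conversion

If $\Gamma \subseteq \{x \approx L,\ L \approx x,\ x \triangleright_U L \mid x \in X \text{ and } L \in T \setminus G\}$, then

$\Gamma \Rightarrow_{\Phi_1} \mathrm{conv}(\Gamma)$

where $\mathrm{conv}(\Gamma) = \{x \approx_{vf} P \mid x \in X$ and ($x \approx P \in \Gamma$ or $P \approx x \in \Gamma$ or $x \triangleright_U P \in \Gamma$)$\}$.
Note that conv($\Gamma$) satisfies the stop condition of Stage I.

In the following examples, we use the TRS $R_1$ shown in Section 1.

Example 2. $\{eq(y, not(y)) \approx y\} \Rightarrow_{\Phi_1} \{y \approx_{vf} eq(y, not(y))\}$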



TT Transformation

If $p \simeq M \approx N$ with $M, N \notin X$ and either $M \in T \setminus (G \cup X)$ or $N \in T \setminus (G \cup X)$,
we choose one of the following three cases. Let k = arity(root(M)). We guess
that there exists a sequence $\gamma : M\theta \leftrightarrow^* N\theta$.

1. If root(M) = root(N) then

$\Gamma' \cup \{M \approx N\} \Rightarrow_{\Phi_1} \Gamma' \cup \{M|_i \approx N|_i \mid 1 \le i \le k\}$

In this case, we guess that $\gamma : M\theta \leftrightarrow^* N\theta$ is $\varepsilon$-invariant.

2. If $M \notin G$ then we choose a rule $\alpha \to \beta \in R$ that satisfies root(M) = root($\alpha$)
and

$\Gamma' \cup \{M \approx N\} \Rightarrow_{\Phi_1} \Gamma' \cup \mathrm{decompose}(M, \alpha, O_X(\alpha)) \cup \{\beta \approx N\}$

We rename each variable occurring in $\alpha$ to a fresh variable if necessary.
In this case, we guess the leftmost $\varepsilon$-reduction step in $\gamma : M\theta \to^* \alpha\sigma \to
\beta \leftrightarrow^* N\theta$ (where the subsequence $M\theta \to^* \alpha\sigma$ is $\varepsilon$-invariant).

3. If $M \in G$ then we choose a rule $\alpha \to \beta \in R$ that satisfies $M \to^* \beta$ and

$\Gamma' \cup \{M \approx N\} \Rightarrow_{\Phi_1} \Gamma' \cup \{\beta \approx N\}$

and then do a single transformation on $\beta \approx N$ by case 1 or 2 of this TT
transformation.^1 Note that it is decidable whether or not $M \to^* \beta$ [9]. In
this case, we guess the rightmost $\varepsilon$-reduction step in $\gamma : M\theta \to^* \alpha\sigma \to
\beta \leftrightarrow^* N\theta$.

Example 3. In case 1 of the TT transformation,

$\{eq(not(x),t) \approx eq(y, not(y))\} \Rightarrow_{\Phi_1} \{not(x) \approx y,\ t \approx not(y)\}$

By choosing rule $eq(not(x),x) \to f$ in case 2,

$\{eq(not(x),t) \approx eq(y, not(y))\}$
$\Rightarrow_{\Phi_1} \mathrm{decompose}(eq(not(x),t),\ eq(not(x'),x'),\ \{11, 2\}) \cup \{f \approx eq(y, not(y))\}$
$= \{not(x) \triangleright_{\{1\}} not(x'),\ t \approx x',\ f \approx eq(y, not(y))\}$



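TL$_\triangleright$ Transformation

If $p = M \triangleright_U N$ with $M \notin X$ and if $M \in G$ then $N \notin G$ or $U \ne \emptyset$, we choose one
of the following three cases. We assume that $U \ne \{\varepsilon\}$, since $M \triangleright_{\{\varepsilon\}} N$ can be
replaced by $M \approx N$. We guess that there exists a sequence $\gamma : M\theta \to^* L \leftrightarrow^* N\theta$
for some term L such that for the subsequence $\gamma' : L \leftrightarrow^* N\theta$ and for any
$v \in \mathcal{R}(\gamma')$, there exists $u \in U$ such that $u \le v$.

1. If root(M) = root(N) then

$\Gamma' \cup \{M \triangleright_U N\} \Rightarrow_{\Phi_1} \Gamma' \cup \mathrm{decompose}(M, N, U)$

and if $M \in G$, then apply the VG transformation described later to all
pairs in $\mathrm{decompose}(M, N, U) \cap (G \times X)$. In this case, we guess that
$\gamma : M\theta \to^* L \leftrightarrow^* N\theta$ is $\varepsilon$-invariant.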

^1 To prove the termination of the algorithm, each transformation must decrease the
"size" of $\Gamma$. In some cases a single transformation does not decrease the "size" of
$\Gamma$. By making two TT transformations successively, we can ensure
termination. For the same reason, we make a finite number of successive
transformations in some cases of the TL$_\triangleright$ and VT transformations.
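2. If $M \notin G$ then we choose a rule $\alpha \to \beta \in R$ that satisfies root(M) = root($\alpha$)
and

$\Gamma' \cup \{M \triangleright_U N\} \Rightarrow_{\Phi_1} \Gamma' \cup \mathrm{decompose}(M, \alpha, O_X(\alpha)) \cup \{\beta \triangleright_U N\}$

We rename each variable occurring in $\alpha$ to a fresh variable if necessary. In
this case, we guess the leftmost $\varepsilon$-reduction step in $\gamma : M\theta \to^* \alpha\sigma \to \beta \to^*
L \leftrightarrow^* N\theta$ (where the subsequence $M\theta \to^* \alpha\sigma$ is $\varepsilon$-invariant).

3. If $M \in G$ then we choose a rule $\alpha \to \beta \in R$ that satisfies $M \to^* \beta$ and

$\Gamma' \cup \{M \triangleright_U N\} \Rightarrow_{\Phi_1} \Gamma' \cup \{\beta \triangleright_U N\}$

and then transform $\beta \triangleright_U N$ by case 1 of the TL$_\triangleright$ transformation. Note
that it is decidable whether or not $M \to^* \beta$ [9]. In this case, we guess the
rightmost $\varepsilon$-reduction step in $\gamma : M\theta \to^* \alpha\sigma \to \beta \to^* L \leftrightarrow^* N\theta$.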

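Example 4. In case 1 of the TL$_\triangleright$ transformation,

$\{eq(not(x),t) \triangleright_{\{21\}} eq(y, not(y))\} \Rightarrow_{\Phi_1} \{not(x) \approx y,\ t \triangleright_{\{1\}} not(y)\}$

By choosing rule $eq(not(x),x) \to f$ in case 2,

$\{eq(not(x),t) \triangleright_{\{21\}} eq(y, not(y))\} \Rightarrow_{\Phi_1}$
$\{not(x) \triangleright_{\{1\}} not(x'),\ t \approx x'\} \cup \{f \triangleright_{\{21\}} eq(y, not(y))\}$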



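GG Transformation

1. If $p = M \approx N$ with $M, N \in G$ and $M \downarrow N$ then

$\Gamma' \cup \{M \approx N\} \Rightarrow_{\Phi_1} \Gamma'$

Note that it is decidable whether or not $M \downarrow N$ [9].

2. If $p = M \triangleright_\emptyset N$ with $M, N \in G$ and $M \to^* N$ then

$\Gamma' \cup \{M \triangleright_\emptyset N\} \Rightarrow_{\Phi_1} \Gamma'$

Note that it is decidable whether or not $M \to^* N$ [9].

Example 5.

$\{eq(f, not(f)) \approx f\} \Rightarrow_{\Phi_1} \emptyset$

Note that $eq(f, not(f)) \downarrow f$ holds, e.g., $eq(f, not(f)) \to eq(not(t), not(f)) \to
eq(not(not(f)), not(f)) \to f$.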
VG Transformation

1. If $p \simeq x \approx M$ with $x \in X$ and $M \in G$, we choose an element $M'$ in $L_{min}(M)$.

$\Gamma' \cup \{x \approx M\} \Rightarrow_{\Phi_1} \Gamma'[M'/x]$

2. If $p = x \triangleright_U M$ with $x \in X$ and $M \in G$, we choose an element $M'$ in
$L_{min}(M)$.

$\Gamma' \cup \{x \triangleright_U M\} \Rightarrow_{\Phi_1} \Gamma'[M'/x] \cup \{M' \triangleright_U M\}$

Example 6. By choosing $p = (y, f)$ and $M' = f$ (note that $f \in L_{min}(f)$),

$\{y \approx f,\ eq(y, not(y)) \approx f\} \Rightarrow_{\Phi_1} \{eq(f, not(f)) \approx f\}$
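
Case 1 of the VG transformation amounts to removing the chosen pair and substituting the chosen minimal term for the variable in the remaining pairs. The Python sketch below illustrates this on Example 6. It is an assumption-laden illustration: pairs of type $\approx$ are encoded as 2-tuples of terms, terms as nested tuples with variables as strings, and the element $M' \in L_{min}(M)$ is supplied by the caller rather than computed (the function names are hypothetical).

```python
def substitute(term, x, N):
    """M[N/x]: replace every occurrence of variable x in term by N."""
    if isinstance(term, str):
        return N if term == x else term
    return (term[0],) + tuple(substitute(arg, x, N) for arg in term[1:])

def vg_step(pairs, chosen, M_prime):
    """Case 1 of VG: drop the pair (x, M) and apply [M'/x] to the rest.

    M_prime is assumed to be an element of L_min(M) chosen by the caller.
    """
    x, _ = chosen
    rest = [p for p in pairs if p != chosen]
    return [(substitute(l, x, M_prime), substitute(r, x, M_prime)) for l, r in rest]
```

Applied to the set $\{y \approx f,\ eq(y, not(y)) \approx f\}$ with chosen pair $(y, f)$ and $M' = f$, the step returns the single pair $eq(f, not(f)) \approx f$.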

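VT Transformation

1. If $p \simeq (x, M)$ with $x \in X$ and $M \in T \setminus (G \cup X)$, we choose a rule $\alpha \to \beta \in R$
and a position $v \in O(M)$ such that $M|_v \notin G \cup X$.

$\Gamma' \cup \{x \approx M\} \Rightarrow_{\Phi_1} \Gamma' \cup \{x \approx M[\beta]_v,\ M|_v \approx \beta\}$

and if $v = \varepsilon$, then apply the VG transformation to $x \approx \beta$. In this case,
we guess the sequence $\gamma : x\theta \leftrightarrow^* M\theta[\alpha\sigma]_v \to M\theta[\beta]_v \leftrightarrow^* M\theta$ (or $x\theta \leftrightarrow^*
M\theta[\beta]_v \leftarrow M\theta[\alpha\sigma]_v \leftrightarrow^* M\theta$) for some $\sigma$ and $v \in \mathrm{Min}(\mathcal{R}(\gamma))$.

2. If $p = x \triangleright_U M$ with $x \in X$ and $M \in T \setminus (G \cup X)$, we choose a rule $\alpha \to \beta \in R$
and a position $v \in O(M)$ such that $M|_v \notin G \cup X$.

$\Gamma' \cup \{x \triangleright_U M\} \Rightarrow_{\Phi_1} \Gamma' \cup \{x \triangleright_{U'} M[\beta]_v,\ \beta \triangleright_{U/v} M|_v\}$

where $U' = \{u \in U \mid u \mid v\}$, and if $v = \varepsilon$, then apply the VG transformation to
$x \triangleright_\emptyset \beta$. Here, we assume that $\gamma : x\theta \to^* x\theta[\alpha\sigma]_v \to x\theta[\beta]_v \leftrightarrow^* M\theta[\beta]_v \leftrightarrow^*
M\theta$ for some $\sigma$ and $v \in \mathrm{Min}(\mathcal{R}(\gamma))$ where $x\theta[\alpha\sigma]_v \to x\theta[\beta]_v$ is the rightmost
v-reduction and there is no $u \in U$ such that $u < v$.

Example 7. By choosing $v = \varepsilon$ and rule $eq(not(x),x) \to f$,

$\{eq(y, not(y)) \approx y\} \Rightarrow_{\Phi_1} \{y \approx f,\ eq(y, not(y)) \approx f\}$

After that we apply the VG transformation for $p = (y, f)$.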

3.2 Stage II 

Below we define the one step transformation $\Phi_2$ of Stage II. We write $\Gamma \Rightarrow_{\Phi_2} \Gamma'$
if $\Phi_2(\Gamma) \ni \Gamma'$.

We begin with $\Gamma$ which is the output of Stage I. Then, we repeatedly apply
the transformation $\Phi_2$ until the current $\Gamma$ satisfies the stop condition of Stage
II defined below. We consider all possibilities in order to ensure the correctness
of the algorithm. If $\Gamma$ satisfies this condition, then we check the $\emptyset$-unifiability of
$\Gamma$ in the Final Stage.
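Definition 6. Let $\Gamma_X = \{x \approx y \mid x, y \in X \text{ and } x \approx_{vf} y \in \Gamma\}$ and $\Gamma_T =
\{P \approx_{vf} Q \in \Gamma \mid P \notin X \text{ or } Q \notin X\}$. We do not distinguish $\Gamma_X$ and $\Gamma \setminus \Gamma_T$,
since $x \approx y \in \Gamma_X$ iff $x \approx_{vf} y \in \Gamma \setminus \Gamma_T$. Let $\sim_{\Gamma_X}$ be the equivalence relation
derived from $\Gamma_X$, i.e., the reflexive transitive and symmetric closure of $\Gamma_X$. Let
$[x]_{\sim_{\Gamma_X}}$ be the equivalence class of $x \in X$.

Definition 7. ([12]) $\Gamma$ is in solved form if for any $x \approx_{vf} P,\ y \approx_{vf} Q \in \Gamma_T$,
($x, y \in X$) and ($x \sim_{\Gamma_X} y \Rightarrow P = Q$) hold.

The stop condition of Stage II is that $\Gamma$ satisfies one of the following two
conditions.

(1) For any $P \approx_{vf} Q \in \Gamma_T$, we have $P \in X$ and $Q \in T$, and $\Gamma$ is in solved form.

(2) $\Gamma = \{\mathrm{fail}\}$.

(Note. $\Gamma = \emptyset$ satisfies condition (1).)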

To describe the transformations used in Stage II, we need the following def- 
initions. 

Definition 8. For a term M we define $\#_0(M)$ by

' }0 otherwise 

For $(i,j), (i',j') \in \mathbb{N} \times \mathbb{N}$, where $\mathbb{N}$ is the set of nonnegative integers, we use a
lexicographic ordering >. This measure is defined to give the number of variable
positions of term M the highest priority and will be used in Section 4 to define
size($\Gamma$) for $\Gamma \subseteq E_0$.

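Definition 9. For $P, Q \notin X$, we define function common(P, Q) as follows.
Let $U = \mathrm{Min}(O_X(P) \cup O_X(Q))$ and let $V = \mathrm{Min}\{v \in \overline{O}_X(P) \cup \overline{O}_X(Q) \mid \forall u \in
U.\ u \mid v\}$. If $P|_v \downarrow Q|_v$ holds for any $v \in V$ and $P[c, \ldots, c]_{(v_1, \ldots, v_n)} =
Q[c, \ldots, c]_{(v_1, \ldots, v_n)}$, where c is a constant in G and $V \cup U = \{v_1, \ldots, v_n\}$,
then common(P, Q) = true, otherwise false. Note that it is decidable whether
$P|_v \downarrow Q|_v$ [9]. For example, let P and Q be the two terms drawn as trees in the
figure of the original paper (the figure is not recoverable here), where $M, N \in G$.
If $M \downarrow N$ and $P[c, c, \ldots] = Q[c, c, \ldots]$, then common(P, Q) = true.

In Stage II, we first choose an element p in $\Gamma \setminus X \times X$ nondeterministically
and then apply one of the following transformations to $\Gamma$ according to the type
of the chosen p. If no transformation is possible, $\Gamma \Rightarrow_{\Phi_2} \{\mathrm{fail}\}$. Let $\Gamma' = \Gamma \setminus \{p\}$.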
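Decomposition

If $p = x \approx_{vf} P$ with $x \in X$ and $P \in T \setminus G$ and there exists a pair $q = y \approx_{vf} Q \in
\Gamma_T$ such that $x \sim_{\Gamma_X} y$ and $P \ne Q$ and $Q \notin G$ and common(P, Q) then,

$\Gamma'' \cup \{x \approx_{vf} P,\ y \approx_{vf} Q\} \Rightarrow_{\Phi_2} \Gamma'' \cup \{y \approx_{vf} Q\}$
$\qquad \cup\ \{P|_u \approx_{vf} Q|_u \mid u \in U \text{ and } P|_u \in X\}$
$\qquad \cup\ \{Q|_u \approx_{vf} P|_u \mid u \in U \text{ and } P|_u \notin X\}$

where $\Gamma'' = \Gamma' \setminus \{q\}$ and $U = \mathrm{Min}(O_X(P) \cup O_X(Q))$. Here, we assume that
$\#_0(P) \ge \#_0(Q)$.

In this case, if there exists a locally minimal R-unifier $\theta$ of $\Gamma$, then there exists
a rewrite sequence $\delta : P\theta \leftrightarrow^* x\theta \leftrightarrow^* y\theta \leftrightarrow^* Q\theta$. Since $x\theta$ and $y\theta$ are minimal,
the sequence $\delta$ has no reduction above the leaves of $x\theta$ and $y\theta$ by Lemma 1.
For any reduction position v of the subsequences $P\theta \leftrightarrow^* x\theta$ and $y\theta \leftrightarrow^* Q\theta$, we
have $v \not< u$ for any $u \in \mathrm{Min}(O_X(P) \cup O_X(Q))$. So, we can decompose subgoals
$x \approx_{vf} P$ and $y \approx_{vf} Q$ into $P|_u \approx_{vf} Q|_u$ or $Q|_u \approx_{vf} P|_u$. For the termination and
validity of the algorithm, we leave $y \approx_{vf} Q$, whose size is not greater than the
size of $x \approx_{vf} P$, in $\Gamma$.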

Example 8. Let $\Gamma = \{p, q, x \approx_{vf} y\}$ with $p = x \approx_{vf} eq(not(w),t)$ and
$q = y \approx_{vf} eq(z, not(f))$. Then, $\mathrm{common}(eq(not(w),t),\ eq(z, not(f)))$ is true
because $t \to not(f)$ and $eq(not(w),t)[c,c]_{(1,2)} = eq(c,c) = eq(z, not(f))[c,c]_{(1,2)}$
hold. $\#_0(eq(not(w),t)) = \#_0(eq(z, not(f))) = \{(1,2)\}_m$. So, we can make the
following Decomposition transformation:

$\{x \approx_{vf} eq(not(w),t),\ y \approx_{vf} eq(z, not(f)),\ x \approx_{vf} y\} \Rightarrow_{\Phi_2}$
$\{x \approx_{vf} eq(not(w),t),\ z \approx_{vf} not(w),\ x \approx_{vf} y\}$

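GT Transformation

If $p = P \approx_{vf} Q$ with $P \in G$ and $Q \in T \setminus (G \cup X)$ and common(P, Q) then

$\Gamma' \cup \{P \approx_{vf} Q\} \Rightarrow_{\Phi_2} \Gamma' \cup \{P|_u \approx_{vf} Q|_u \mid u \in O_X(Q)\}$

VG Transformation

If $p \simeq x \approx_{vf} P$ with $P \in G$ and $x \in X$, we choose an element $P' \in L_{min}(P)$.

$\Gamma' \cup \{x \approx_{vf} P\} \Rightarrow_{\Phi_2} \Gamma'[P'/x]$

This is similar to the VG transformation at Stage I.

GG Transformation

If $p = P \approx_{vf} Q$ with $P, Q \in G$ and $P \downarrow Q$ then

$\Gamma' \cup \{P \approx_{vf} Q\} \Rightarrow_{\Phi_2} \Gamma'$

Note that it is decidable whether or not $P \downarrow Q$ [9].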
3.3 Final Stage

Let $\Gamma_f$ be the output of Stage II. If $\Gamma_f$ is $\emptyset$-unifiable, then our algorithm answers
'R-unifiable', otherwise fail, i.e., no answer.

Note that since $\emptyset$-unifiability is equal to usual unifiability, any unification
algorithm can be used [3,12]. In fact, since $\Gamma_f$ satisfies the stop condition of
Stage II, $\Gamma_f$ is in solved form, so that it is known that $\Gamma_f$ is unifiable iff $\Gamma_f$ is
not cyclic [12]. The definition that $\Gamma_f$ is cyclic is given as follows.

Definition 10. A relation $\mapsto$ over X is defined as follows: $x \mapsto y$ iff there exist
$x' \sim_{\Gamma_X} x$, $y' \sim_{\Gamma_X} y$ and $P \in T \setminus (G \cup X)$ such that $x' \approx_{vf} P \in \Gamma$ and $y' \in V(P)$
hold. Let $\mapsto^+$ be the transitive closure of $\mapsto$. Then, $\Gamma$ is cyclic if there exists
$x \in X$ such that $x \mapsto^+ x$.

We will prove later that $\Gamma_f$ is not cyclic if there exists a locally minimal
R-unifier of $\Gamma_f$.
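
The cyclicity test of the Final Stage can be sketched in Python as a cycle search on the graph of equivalence classes of variables. This is an illustration under assumptions made here, not the authors' implementation: the input is split into variable-variable pairs (playing the role of $\Gamma_X$) and variable-term pairs $x \approx_{vf} P$, terms are nested tuples with variables as strings, and the names `variables` and `is_cyclic` are hypothetical.

```python
def variables(term):
    """V(P): the set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    out = set()
    for arg in term[1:]:
        out |= variables(arg)
    return out

def is_cyclic(var_pairs, term_pairs):
    """Check whether some class of variables reaches itself via the relation
    of Definition 10, taken modulo the equivalence generated by var_pairs."""
    parent = {}

    def find(x):  # union-find representative of the class of x
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in var_pairs:
        parent[find(x)] = find(y)

    edges = {}
    for x, P in term_pairs:
        if isinstance(P, str) or not variables(P):
            continue  # only P in T \ (G u X) contributes edges
        edges.setdefault(find(x), set()).update(find(y) for y in variables(P))

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(n):  # depth-first search; a grey successor closes a cycle
        colour[n] = GREY
        for m in edges.get(n, ()):
            c = colour.get(m, WHITE)
            if c == GREY or (c == WHITE and visit(m)):
                return True
        colour[n] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))
```

For instance, the set $\{y \approx_{vf} eq(y, not(y))\}$ is reported as cyclic, whereas a hypothetical set $\{x \approx_{vf} not(y)\}$ with x and y in different classes is not.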

Correctness condition of $\Phi$:

(1) Each of $\Rightarrow_{\Phi_1}$ and $\Rightarrow_{\Phi_2}$ is terminating and finite-branching, and

(2) $\Gamma_0 = \{M_0 \approx N_0\}$ is R-unifiable iff there exist $\Gamma_1$ and $\Gamma_f$ such that
$\Gamma_0 \Rightarrow_{\Phi_1}^* \Gamma_1 \Rightarrow_{\Phi_2}^* \Gamma_f$, $\Gamma_1$ satisfies the stop condition of Stage I, $\Gamma_f$ satisfies the one
of Stage II, and $\Gamma_f$ is $\emptyset$-unifiable (i.e., not cyclic and $\Gamma_f \ne \{\mathrm{fail}\}$).

Note that since $\Phi$ is a nondeterministic algorithm, we need an exhaustive
search of all the computations from $\Gamma_0$, but it is ensured that we can
decide whether $\Gamma_0$ is R-unifiable or not within finite time by (1) and (2) above.

Our algorithm can be easily transformed into one which produces a locally
minimal unifier of $\Gamma_0$ iff $\Gamma_0$ is R-unifiable, since the information can be obtained
when VG transformations are made.



3.4 Example 

We consider $R_1 = \{eq(x,x) \to t,\ eq(not(x),x) \to f,\ t \to not(f),\ f \to not(t)\}$
given in Section 1. For $\Gamma_0 = \{eq(y, not(y)) \approx y\}$, our algorithm $\Phi$ can make the
following computation:

$\{eq(y, not(y)) \approx y\} \Rightarrow_{VT} \{y \approx f,\ eq(y, not(y)) \approx f\}$
$\qquad \Rightarrow_{VG} \{eq(f, not(f)) \approx f\}$
$\qquad \Rightarrow_{GG} \emptyset$

Obviously, $\emptyset$ satisfies the stop conditions of Stages I and II and is $\emptyset$-unifiable.
Hence, our algorithm decides that $\Gamma_0$ is $R_1$-unifiable. In fact, $\theta$ satisfying $y\theta = f$
is an $R_1$-unifier which can be computed by our algorithm.

Note that $\{eq(y, not(y)) \approx y\}$ is transformed into $\{y \approx_{vf} eq(y, not(y))\}$ by
Conversion, and the result satisfies the stop conditions of Stages I and II. But
$\{y \approx_{vf} eq(y, not(y))\}$ is cyclic, so this computation sequence fails in the final stage.
4 Correctness of Algorithm $\Phi$

In this section, we give the lemmata needed to conclude the correctness of
Algorithm $\Phi$ and the main theorem. We only outline the proof of part (1) of the
correctness condition, i.e., termination of $\Phi$. The reader is referred to the full
version of this paper [13] for the complete proofs.

Definition 11. For $\Gamma \subseteq E_0$, let $\mathrm{core}(\Gamma) = \{M \approx N \mid M \approx N \in \Gamma$ or $M \triangleright_U
N \in \Gamma$ or $M \approx_{vf} N \in \Gamma\}$.

Definition 12. Substitutions $\theta$ and $\theta'$ are consistent if $x\theta = x\theta'$ for any $x \in
\mathrm{Dom}(\theta) \cap \mathrm{Dom}(\theta')$.

Definition 13. Let $\Phi : \mathrm{Power}(E_0) \to \mathrm{Power}(\mathrm{Power}(E_0))$ be a transformation.
Then, $\Phi$ is valid iff the following validity conditions (V1) and (V2) hold. For
any $\Gamma \subseteq E_0$, let $\Phi(\Gamma) = \{\Gamma_1, \ldots, \Gamma_n\}$.

(V1) If $\theta$ is a locally minimal R-unifier of $\Gamma$, then there exist i $(1 \le i \le n)$ and
a substitution $\theta'$ such that $\theta'$ is consistent with $\theta$ and $\theta'$ is a locally minimal
R-unifier of $\Gamma_i$.

(V2) If there exists i $(1 \le i \le n)$ such that core($\Gamma_i$) is R-unifiable then core($\Gamma$) is
R-unifiable.



4.1 Correctness of Stage I 

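Lemma 3. Stage I is terminating and finite-branching.

Proof. For $\Gamma \subseteq E_0$, we define size($\Gamma$) as $(\#_1(\Gamma), \#_2(\Gamma), \#_3(\Gamma), \#_4(\Gamma))$.
Here

$\#_1(\Gamma) = \bigsqcup_{P \approx Q \in \Gamma} (\#_0(P) \sqcup \#_0(Q)) \sqcup (\bigsqcup_{P \triangleright_U Q \in \Gamma} \#_0(P))$

$\#_2(\Gamma) = \bigsqcup_{P \triangleright_U Q \in \Gamma} \#_0(Q)$

$\#_3(\Gamma) = \bigsqcup_{P \triangleright_U Q \in \Gamma} \{|u| \mid u \in U\}_m$

$\#_4(\Gamma) = |\Gamma|$

We use a lexicographic ordering > to compare size($\Gamma$) and size($\Gamma'$) for all $\Gamma, \Gamma' \subseteq
E_0$. For every transformation $\Phi_1(\Gamma) = \{\Gamma_1, \ldots, \Gamma_k\}$ in Stage I, we can prove that
size($\Gamma$) > size($\Gamma_i$) for every $1 \le i \le k$, as shown in the following table.

                              | #1 | #2 | #3 | #4
  TT                          | >  |    |    |
  case 1 of TL$_\triangleright$ | »  | »  | »  |
  case 2 of TL$_\triangleright$ | >  |    |    |
  case 3 of TL$_\triangleright$ | >  | >  | >  |
  GG                          | =  | =  | =  | >
  VG                          | >  |    |    |
  case 1 of VT                | >  |    |    |
  case 2 of VT                | »  | >  |    |

Moreover, if $\Gamma$ is a finite set, then k is finite, i.e., Stage I is finite-branching.
Thus, this lemma holds.

Lemma 4. Stage I is valid [13].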
4.2 Correctness of Stage II

Lemma 5. Stage II is terminating and finite-branching.

Proof. For $\Gamma \subseteq \{M \approx_{vf} N \mid M, N \in T\}$, we define size($\Gamma$) as $(\$_1(\Gamma), \$_2(\Gamma))$.
Here

$\$_1(\Gamma) = \bigsqcup_{P \approx_{vf} Q \in \Gamma} (\#_0(P) \sqcup \#_0(Q))$

$\$_2(\Gamma) = |\Gamma|$

We use a lexicographic ordering > to compare size($\Gamma$) and size($\Gamma'$) for all $\Gamma, \Gamma' \subseteq
E_0$. For every transformation $\Phi_2(\Gamma) = \{\Gamma_1, \ldots, \Gamma_k\}$ in Stage II, we can show that
size($\Gamma$) > size($\Gamma_i$) for every $1 \le i \le k$. Moreover, if $\Gamma$ is a finite set, then k is
finite. Thus, this lemma holds. □

Lemma 6. Stage II is valid [13].

4.3 Correctness of Final Stage 

Lemma 7. Assume that $\Gamma_f$ satisfies the stop condition of Stage II. Then, $\Gamma_f$
is not cyclic if there exists a locally minimal R-unifier $\theta$ of $\Gamma_f$.

Proof. Let $\theta$ be a locally minimal R-unifier of $\Gamma_f$. Note that for any $x, x' \in X$,
if $x \sim_{\Gamma_X} x'$ then $x\theta \leftrightarrow^* x'\theta$ holds, so that $H(x\theta) = H(x'\theta)$ holds by local
minimality of $\theta$. Now, we show that for any $x \approx_{vf} P \in \Gamma_f$ and $y \in V(P)$,
if $P \notin X$, then $H(x\theta) > H(y\theta)$ holds. Let $y = P|_u$ for some $u \ne \varepsilon$. Then
$x\theta|_u \leftrightarrow^* y\theta$ holds, since $\theta$ is an R-unifier of $\Gamma_f$. Local minimality of $\theta$ ensures
that $H(x\theta|_u) \ge H(y\theta)$. Hence, $H(x\theta) > H(y\theta)$. It follows that for any $x, y \in X$,
if $x \mapsto y$, then $H(x\theta) > H(y\theta)$ holds. Therefore, it is impossible that we have
$x \mapsto^+ x$. That is, $\Gamma_f$ is not cyclic. □

Lemma 8. Assume that $\Gamma_f$ satisfies the stop condition of Stage II. Then, if
there exists a locally minimal R-unifier of $\Gamma_f$ then $\Gamma_f$ is $\emptyset$-unifiable.

Proof. Obviously, $\Gamma_f \ne \{\mathrm{fail}\}$, so that $\Gamma_f$ is in solved form. By Lemma 7, $\Gamma_f$ is
not cyclic, so that $\Gamma_f$ is $\emptyset$-unifiable. □

(Note. The converse of Lemma 8 does not necessarily hold.)

5 Conclusion 

Now, we can deduce our main theorem. 

Theorem 1. The unification problem for confluent right-ground term rewriting 
systems is decidable. 

Proof. By Lemmata 3 and 5, part (1) of the correctness condition of $\Phi$ holds
and by Lemmata 4 and 6, Stages I and II are valid, so that if $\Gamma_0 = \{M_0 \approx N_0\}$
is R-unifiable, then there exist $\Gamma_1$ and $\Gamma_f$ such that $\Gamma_0 \Rightarrow_{\Phi_1}^* \Gamma_1 \Rightarrow_{\Phi_2}^* \Gamma_f$, $\Gamma_1$
satisfies the stop condition of Stage I, $\Gamma_f$ satisfies the one of Stage II, and
there exists a locally minimal R-unifier of $\Gamma_f$. Hence, by Lemma 8, the
only-if-part of part (2) of the correctness condition of $\Phi$ holds. Conversely, the if-part
is ensured by validity of the transformations $\Phi_1$ and $\Phi_2$. Thus, part (2) of the
correctness condition of $\Phi$ holds. Therefore, decidability of $\emptyset$-unifiability ensures
this theorem. □
Abstract. We discuss the termination methods using the higher-order
recursive path ordering and the general scheme for higher-order rewriting
systems and combinatory reduction systems.



1 Introduction

A rewriting system is said to be terminating if all rewrite sequences are finite.
Many methods to prove termination of first-order term rewriting have been
studied. For higher-order rewriting, where bound variables may be present, there are
so far significantly fewer results available. What makes this situation even worse
is that there are several brands of higher-order rewriting, and it is often not
immediately clear how to apply or adapt a result obtained in one framework to
another one. We distinguish here three variants of higher-order rewriting. First
there are the higher-order rewriting systems (HRSs) introduced by Nipkow.
Here rewriting is defined modulo $\beta\eta$ of simply typed $\lambda$-calculus. Second there
are the combinatory reduction systems (CRSs) introduced by Klop. Third
there are the algebraic-functional systems (AFSs) introduced by Jouannaud and
Okada. Here the reduction relation of interest is the union of $\beta$-reduction
and the reduction relation induced by the algebraic rewrite rules (which may be
higher-order). Matching in an AFS is syntactic (not modulo $\beta$).

An important method to prove termination of a first-order term rewriting
system is the one using the recursive path ordering (rpo) due to Dershowitz.
Jouannaud and Rubio present a generalization of the recursive path ordering
to the higher-order case, in the framework of AFSs. The crucial idea is to
show well-foundedness of the ordering using the notion of computability from
the proof of termination of typed $\lambda$-calculi due to Tait and Girard. The usual
proof of well-foundedness of rpo, and also the proofs of well-foundedness of
several earlier orderings designed to prove termination of higher-order rewriting,
rely instead on Kruskal's tree theorem. Because so far no sufficiently
expressive higher-order variant of Kruskal's tree theorem seems to be known,
those higher-order term orderings don't have the full power of rpo.

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 261-273, 2001.
The main purpose of this paper is to make the termination method using the
higher-order version of the recursive path ordering (horpo) more widely available
by presenting it for HRSs and CRSs. The fact that horpo can be adapted to prove
termination of HRSs is already remarked in [?], and worked out in [7]. Here we
take a different approach, as explained in Section 4.

Another method to prove termination of AFSs is due to Jouannaud and
Okada [?], and makes use of the notion of general scheme. The general scheme is
designed to make the proof of termination of typed λ-calculus due to Tait and
Girard adaptable to the case of the particular AFS. Blanqui [2] studies versions
of the general scheme for higher-order rewriting with a CRS-like syntax and for
HRSs. Here we consider a simpler form of the general scheme, closer to the one
considered in [?]. If we consider pure horpo and the pure general scheme, then
the two methods to prove termination are incomparable, as shown by examples.
The general scheme can be used to upgrade horpo, as done in [?]. In this way
the power of both methods is combined.

Finally, we remark that for HRSs there is a semantical method to prove
termination due to Van De Pol [?], using an interpretation (to be given by the
user) of the function symbols as functionals.



2 Higher-Order Rewriting 



In this section we briefly recall the syntax of higher-order rewriting systems
(HRSs) as introduced by Nipkow [?] and combinatory reduction systems (CRSs) as
introduced by Klop [?]. For more detailed accounts we refer to [?].
Examples of higher-order rewriting systems in HRS, CRS, and AFS format are
available at http://www.cs.vu.nl/~femke/papers.html.



2.1 Higher-Order Rewriting Systems 

In a HRS we work modulo the βη-relation of simply typed λ-calculus. Types
are built from a non-empty set of base types and the binary type constructor
→ as usual. For every type we assume a countably infinite set of variables of
that type, written as x, y, z, .... A signature is a non-empty set of typed function
symbols. The set of preterms of type A over a signature Σ consists exactly of
the expressions s for which we can derive s : A using the following rules:

1. x : A for a variable x of type A,

2. f : A for a function symbol f of type A in Σ,

3. if A = A' → A'', and x : A' and s : A'', then (x. s) : A,

4. if s : A' → A and t : A', then (s t) : A.
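The four derivation rules can be read as a small type checker. The following is a minimal sketch in Python, not taken from the paper: the class names (Arrow, Var, Fun, Abs, App), the representation of base types as strings, and the function type_of are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union

# Types: a base type is a string, an arrow type is Arrow(dom, cod).
@dataclass(frozen=True)
class Arrow:
    dom: "Type"
    cod: "Type"

Type = Union[str, Arrow]

# Preterms: typed variables, function symbols, abstractions, applications.
@dataclass(frozen=True)
class Var:
    name: str
    type: Type

@dataclass(frozen=True)
class Fun:
    name: str

@dataclass(frozen=True)
class Abs:
    var: Var
    body: "Preterm"

@dataclass(frozen=True)
class App:
    fun: "Preterm"
    arg: "Preterm"

Preterm = Union[Var, Fun, Abs, App]

def type_of(s, signature):
    """Return the type A with s : A, or raise TypeError if none is derivable.

    `signature` is assumed to be a dict from function symbol names to types.
    """
    if isinstance(s, Var):                      # rule 1
        return s.type
    if isinstance(s, Fun):                      # rule 2
        if s.name not in signature:
            raise TypeError(f"unknown function symbol {s.name}")
        return signature[s.name]
    if isinstance(s, Abs):                      # rule 3
        return Arrow(s.var.type, type_of(s.body, signature))
    if isinstance(s, App):                      # rule 4
        f = type_of(s.fun, signature)
        a = type_of(s.arg, signature)
        if not isinstance(f, Arrow) or f.dom != a:
            raise TypeError("ill-typed application")
        return f.cod
    raise TypeError("not a preterm")
```

For instance, with a signature containing a : o and f : o → o, the preterm x. (f x) with x : o gets type o → o, while the application (a a) is rejected.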

The abstraction operator _. _ binds variables, so occurrences of x in s in the
preterm x. s are bound. We work modulo type-preserving α-conversion and
assume that bound variables are renamed whenever necessary in order to avoid
unintended capturing of free variables. Parentheses may be omitted according
to the usual conventions. We make use of the usual notions of substitution of
a preterm t for the free occurrences of a variable x in a preterm s, notation
s[x := t], and replacement in a context, notation C[t]. We write s ⊇ s' if s' is a
subpreterm of s, and use ⊃ for the strict subpreterm relation.

The β-reduction relation, notation →β, is the smallest relation on preterms
that is compatible with formation of preterms and that satisfies the following:

(x. s) t →β s[x := t]

The restricted η-expansion relation, notation →η, is defined as follows. We have

C[s] →η C[x. (s x)]

if s : A → B, and x : A is a fresh variable, and no β-redex is created (hence the
terminology restricted η-expansion). The latter condition is satisfied if s is not
an abstraction (so not of the form z. s'), and doesn't occur in C[s] as the left
part of an application (so doesn't occur in a sub-preterm of the form (s s')).

In the sequel we employ only preterms in η-normal form, where every
subpreterm has the right number of arguments. Instead of s_0 s_1 ... s_m we often
write s_0(s_1, ..., s_m). A preterm is then of the form x_1 ... x_n. s_0(s_1, ..., s_m)
with s_0(s_1, ..., s_m) of base type and all s_i in η-normal form.

A term is a preterm in β-normal form. It is also in η-normal form because
η-normal forms are closed under β-reduction. A term is of the form
x_1 ... x_n. a(s_1, ..., s_m) with a a function symbol or a variable. Because the
βη-reduction relation is confluent and terminating on the set of preterms, every
βη-equivalence class of preterms contains a unique term, which is taken as the
representative of that class.
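To make the step from preterms to terms concrete, here is a small Python sketch of capture-avoiding substitution and β-normalization. It is an illustration under assumptions, not the paper's construction: preterms are encoded as untyped tuples ("var", x), ("fun", f), ("abs", x, body), ("app", s, t), the helper names are hypothetical, and the normalizer is only guaranteed to terminate on preterms that are simply typable.

```python
import itertools

_fresh = itertools.count()  # counter used to generate fresh variable names

def free_vars(s):
    """Set of names of the free variables of preterm s."""
    tag = s[0]
    if tag == "var":
        return {s[1]}
    if tag == "fun":
        return set()
    if tag == "abs":
        return free_vars(s[2]) - {s[1]}
    return free_vars(s[1]) | free_vars(s[2])  # application

def subst(s, x, t):
    """s[x := t], renaming bound variables to avoid capture."""
    tag = s[0]
    if tag == "var":
        return t if s[1] == x else s
    if tag == "fun":
        return s
    if tag == "app":
        return ("app", subst(s[1], x, t), subst(s[2], x, t))
    y, body = s[1], s[2]
    if y == x:                      # x is shadowed, nothing to replace
        return s
    if y in free_vars(t):           # rename y so that t is not captured
        avoid = free_vars(body) | free_vars(t) | {x}
        z = f"{y}_{next(_fresh)}"
        while z in avoid:
            z = f"{y}_{next(_fresh)}"
        body = subst(body, y, ("var", z))
        y = z
    return ("abs", y, subst(body, x, t))

def beta_normal_form(s):
    """Normalize s with respect to beta-reduction."""
    tag = s[0]
    if tag in ("var", "fun"):
        return s
    if tag == "abs":
        return ("abs", s[1], beta_normal_form(s[2]))
    f = beta_normal_form(s[1])
    a = beta_normal_form(s[2])
    if f[0] == "abs":               # contract the redex (x. s) t
        return beta_normal_form(subst(f[2], f[1], a))
    return ("app", f, a)
```

With this encoding, beta_normal_form applied to (x. a) b yields a, which is the normalization step used later in the discussion of monotonicity and closure under β-reduction.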

Because in the discussion we will often use preterms, we use here the notation
s^σ for the replacement of variables according to the substitution σ (without
reduction to β-normal form), and write explicitly s^σ↓β for its β-normal form.

This is in contrast with the usual notations for HRSs.

A rewrite rule is a pair of terms (l, r), written as l → r, satisfying the
following requirements:

1. l and r are of the same base type,

2. l is of the form f(l_1, ..., l_n),

3. all free variables in r occur also in l,

4. a free variable x in l occurs in the form x(y_1, ..., y_n) with y_i η-equivalent to
different bound variables.

The last requirement guarantees that the rewrite relation is decidable because
unification of patterns is decidable [?]. The rewrite rules induce a rewrite
relation → on the set of terms which is defined by the following rules:

1. if s → t then x(..., s, ...) → x(..., t, ...),

2. if s → t then f(..., s, ...) → f(..., t, ...),

3. if s → t then x. s → x. t,

4. if l → r is a rewrite rule and σ is a substitution then l^σ↓β → r^σ↓β.

The last clause in this definition shows that HRSs use higher-order pattern
matching, unlike AFSs, where matching is syntactic.
2.2 Combinatory Reduction Systems

We assume a countably infinite set of variables, written as x,y,z, . . .. We make 
use of the notion of arity which is a natural number indicating how many ar- 
guments a symbol is supposed to get. For every arity a countably infinite set of 
metavariables, written as X,Y, Z, . . . of that arity is assumed. A signature is a 
non-empty set of function symbols, each with a fixed arity. The set of metaterms 
over a signature Σ is defined by the following clauses:

1. a variable x is a metaterm of arity 0,

2. if f is a function symbol in Σ of arity m and s_1, ..., s_m are metaterms, then
f(s_1, ..., s_m) is a metaterm of arity 0,

3. if s is a metaterm of arity m, then [x]s is a metaterm of arity m + 1,

4. a metavariable Z of arity n is a metaterm of arity n,

5. if s_0 is a metaterm of arity m and s_1, ..., s_m are metaterms of arity 0, then
s_0(s_1, ..., s_m) is a metaterm of arity 0 (a meta-application).
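The five clauses determine the arity of every metaterm. The sketch below computes it in Python; the tuple encoding ("var", x), ("fun", f, args), ("abs", x, s), ("meta", Z), ("mapp", s_0, args) and the dictionaries fun_arity and meta_arity are assumptions introduced for this illustration only.

```python
def arity(s, fun_arity, meta_arity):
    """Arity of metaterm s; raises ValueError if s violates a clause."""
    tag = s[0]
    if tag == "var":                                   # clause 1
        return 0
    if tag == "fun":                                   # clause 2
        _, f, args = s
        if len(args) != fun_arity[f]:
            raise ValueError(f"{f} expects {fun_arity[f]} arguments")
        for a in args:                                 # arguments must be metaterms
            arity(a, fun_arity, meta_arity)
        return 0
    if tag == "abs":                                   # clause 3
        return arity(s[2], fun_arity, meta_arity) + 1
    if tag == "meta":                                  # clause 4
        return meta_arity[s[1]]
    if tag == "mapp":                                  # clause 5
        _, head, args = s
        if arity(head, fun_arity, meta_arity) != len(args):
            raise ValueError("head arity does not match number of arguments")
        if any(arity(a, fun_arity, meta_arity) != 0 for a in args):
            raise ValueError("arguments of a meta-application must have arity 0")
        return 0
    raise ValueError("not a metaterm")
```

As a usage example, with Z a metavariable of arity 1 the metaterm [x]Z(x) has arity 1, whereas abs([x]Z(x)) has arity 0.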

This definition of metaterm differs from the usual one: a metaterm can be a 
metavariable applied to metaterms, but also an abstraction applied to one or 
more metaterms. In this way the metaterms contain what are usually called 
the substitutes. Another difference is that here metaterms have an arity. The 
abstraction operator [_]_ binds variables, so occurrences of x in s in the metaterm 
[x]s are bound. We write s ⊇ s' if s' is a submetaterm of s and use ⊃ for the
strict submetaterm relation. For every m ≥ 1 we have a b-reduction rule:

([x_1 ... x_m] s_0)(s_1, ..., s_m) →b s_0[x_1 := s_1 ... x_m := s_m]

Because a variable x doesn't occur in a submetaterm of the form x(s_1, ..., s_m),
an application of the b-reduction rule doesn't create new b-redexes. The relation
→b is like a development; it is confluent and terminating on the set of metaterms.

A term is a metaterm without metavariables or meta-application. A rewrite
rule of a CRS is a pair of metaterms (l, r), written as l → r, satisfying the
following:

1. l and r are closed metaterms of arity 0,

2. l is of the form f(l_1, ..., l_n),

3. all metavariables in r occur also in l,

4. all metavariables in l occur in the form Z(x_1, ..., x_m) with x_1, ..., x_m
different bound variables.

The restriction concerning the arity in the first clause makes that some rewrite
rules that fit in the usual definition of a CRS are not allowed here. An example
is a → [x]a. We use the notation s^σ for s where all metavariables are replaced
according to the definition of the substitution σ. Such a substitution assigns
terms of arity n to metavariables of arity n. The rewrite rules induce a rewrite
relation → on the set of terms which is defined by the following rules:

1. if s → t then f(..., s, ...) → f(..., t, ...),

2. if s → t then [x]s → [x]t,

3. if l → r is a rewrite rule and σ is a substitution, then l^σ↓b → r^σ↓b.
3 Computability

In the following sections we will make use of the notion of computability due to 
Tait and Girard with respect to a relation ⊐ on terms, preterms, or metaterms.
Here we give the definition for both the typed HRS case and the untyped CRS
case where we use the arity of a metaterm. The definition and the properties we
use later are the well-known ones, also used in [?].

Definition 1. The expressions of type A (of some arity) that are computable
with respect to ⊐ are defined by induction on the structure of A as follows:

1. If s : B with B a base type (with s of arity 0) then s is computable with
respect to ⊐ if t is well-founded with respect to ⊐ for all t such that s ⊐ t.

2. If s is of type A_1 → ... → A_n → B with B a base type (of arity n), then
s is computable with respect to ⊐ if for all computable u_1, ..., u_n of type
A_1, ..., A_n we have that s(u_1, ..., u_n) is computable with respect to ⊐.

The following lemma concerns computability with respect to some relation ⊐.

Lemma 1.

1. If s is computable then s is well-founded with respect to ⊐.

2. If s is computable and s ⊐ t then t is computable.

3. If s : B with B a base type (of arity 0) and s' is computable for every s' such
that s ⊐ s' then s is computable.

4. (HRS case) If s[x := u] is computable for every computable u of the right
type, and (x. s)(u) ⊐ s[x := u], then x. s is computable.

(CRS case) If s[x := u] is computable for every computable u and we have
([x]s)(u) ⊐ s[x := u], then [x]s is computable.

4 The Higher-Order Recursive Path Ordering 

This section is concerned with the higher-order version of the recursive path 
ordering (horpo). Jouannaud and Rubio define horpo for what we call here 
AFSs. Here we present a method to prove termination of HRSs and a method 
to prove termination of CRSs. Both methods use an adaptation of horpo as in [?].

4.1 Horpo for HRSs 

We assume a well-founded precedence > on the set of function symbols. We write
≡ for the equivalence relation on types induced by identifying all base types.

Definition 2. We have s ≻ t for preterms s : A and t : A' if A ≡ A' and one
of the following clauses holds:

1. s = f(s_1, ..., s_m)
   t = g(t_1, ..., t_n)
   f > g
   for all i ∈ {1, ..., n}: either s ≻ t_i or s_j ⪰ t_i for some j

2. s = f(s_1, ..., s_m)
   t = f(t_1, ..., t_m)
   (s_1, ..., s_m) ≻ (t_1, ..., t_m)
   for all i ∈ {1, ..., m}: either s ≻ t_i or s_j ⪰ t_i for some j

3. s = f(s_1, ..., s_m)
   s_i ⪰ t for some s_i

4. s = f(s_1, ..., s_m)
   t = t_0(t_1, ..., t_n)
   for all i ∈ {0, ..., n}: either s ≻ t_i or s_j ⪰ t_i for some j

5. s = x(s_1, ..., s_m)
   t = x(t_1, ..., t_m)
   s_i ≻ t_i for some i
   s_i ⪰ t_i for all i

6. s = s_0(s_1, ..., s_m)
   t = t_0(t_1, ..., t_m)
   s_i ≻ t_i for some i
   s_i ⪰ t_i for all i

7. s = x. s_0
   t = x. t_0
   s_0 ≻ t_0

The first three clauses are the same as for the first-order case. The difference is
that here we need to take care of the types: in order to derive s ≻ t we need that
the types of s and t are equivalent. If for instance s : A and t : A → A then it is
not possible to compare s and t using ≻. The condition 'for all i either s ≻ t_i or
s_j ⪰ t_i' in the clauses 1, 2, and 4 is to be understood as follows: if t_i : A' with
A ≡ A' then s ≻ t_i, otherwise we have t_i : B' and then s_j ⪰ t_i for some s_j : B
with B ≡ B'. Note further that in clause 2 we use the notation ≻ also for the
multiset or lexicographic (depending on the function symbol f) extension of the
relation ≻ on preterms. The clause 4 takes care of substitution. For instance, we
have f(x. z(x), a) ≻ (x. z(x)) a because x. z(x) ⪰ x. z(x) and f(x. z(x), a) ≻ a.
In clause 6 it is assumed that s_0 is not (η-equivalent to) a function symbol or a
variable and that m ≥ 1. The clauses 2, 5, 6, and 7 make that ≻ is compatible
with the structure of preterms.
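Since the first three clauses coincide with the first-order recursive path ordering, that fragment is easy to prototype. The Python sketch below covers only the first-order case with multiset status: it ignores types, abstraction, and clauses 4 to 7, so it is not an implementation of horpo. The term encoding (variables as strings, f(s_1, ..., s_m) as the tuple ("f", s_1, ..., s_m)) and the names occurs, multiset_gt, and rpo_gt are assumptions. The precedence is passed as a set of pairs (f, g) meaning f > g and is assumed to be transitively closed.

```python
def occurs(x, t):
    """True if variable x occurs in term t."""
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a) for a in t[1:])

def multiset_gt(ms, ns, gt):
    """Multiset extension of the strict order gt."""
    ms, ns = list(ms), list(ns)
    for m in list(ms):              # cancel common elements pairwise
        if m in ns:
            ms.remove(m)
            ns.remove(m)
    return bool(ms) and all(any(gt(m, n) for m in ms) for n in ns)

def rpo_gt(s, t, prec):
    """First-order rpo with multiset status: True if s is greater than t."""
    if isinstance(s, str):          # a variable is never greater than anything
        return False
    if isinstance(t, str):          # s > x iff x occurs in the compound term s
        return occurs(t, s)
    f, ss = s[0], s[1:]
    g, ts = t[0], t[1:]
    gt = lambda a, b: rpo_gt(a, b, prec)
    if any(si == t or gt(si, t) for si in ss):       # clause 3 (subterm)
        return True
    if not all(gt(s, ti) for ti in ts):              # side condition of clauses 1, 2
        return False
    if (f, g) in prec:                               # clause 1 (precedence)
        return True
    return f == g and multiset_gt(ss, ts, gt)        # clause 2 (same head)
```

In the first-order setting the alternative 's_j ⪰ t_i for some j' already implies s ≻ t_i by clause 3, which is why the sketch checks only s ≻ t_i. A typical use is orienting add(s(x), y) → s(add(x, y)) with the precedence add > s.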

Note that horpo is defined on preterms, not on terms. It is not the case that
s ≻ t implies s↓β ≻ t↓β. For instance (x. a) b ≻ (x. a) c if b > c but not a ≻ a.

The termination method using horpo is as follows: a HRS is terminating if
for every rewrite rule l → r there exists a preterm r' such that l ≻ r' ↠β r. We
call this the horpo criterion (for HRSs).

Example 1.

1. Consider for example the beta-reduction rule of untyped λ-calculus:

app(abs(x. Z(x)), Z') → Z(Z')

Clause 3 does not yield that abs(x. Z(x)) ≻ x. Z(x) because abs(x. Z(x)) has
type T and x. Z(x) has type T → T.

2. The following rewrite rule can be shown to be terminating using horpo:

map(x. F(x), cons(h, t)) → cons(F(h), map(x. F(x), t)).

We take r' = cons((x. F(x))(h), map(x. F(x), t)). First, we have x. F(x) ⪰
x. F(x), and map(x. F(x), cons(h, t)) ≻ h by clause 3, and hence we have
l ≻ (x. F(x))(h) by clause 4. Further, cons(h, t) ≻ t, so l ≻ map(x. F(x), t)
by clause 2. Using the precedence map > cons we conclude l ≻ r' by clause 1.

3. Horpo cannot be used to show termination of the rewrite rule

f(a) → g(x. a)

with f : A → A and g : (A → A) → A because there is no subterm in the
left-hand side to deal with the subterm x. a of functional type.

In the remainder of this section we show that the condition l ≻ r' ↠β r for
every rewrite rule l → r indeed guarantees termination of the HRS. We make
use of the notion of computability with respect to ≻ ∪ →β. The following lemma
follows immediately from the definition of ≻.

Lemma 2. If s ≻ t for preterms s and t then C[s] ≻ C[t].

The proof of the following lemma makes use of induction on a triple as in [?].
Lemma 3. If s_1, ..., s_m are computable, then f(s_1, ..., s_m) is computable.

Proof. The proof proceeds by induction on triples of the form (f, (s_1, ..., s_m), n)
with f a function symbol, s_1, ..., s_m computable preterms, and n a natural
number, ordered by (>, ≻ ∪ →β, >). This ordering is also written as >.

Let s_1, ..., s_m be computable preterms and write s = f(s_1, ..., s_m). Suppose
that s ≻ t or s →β t. We show that t is computable. The following cases are
distinguished.

1. t = g(t_1, ..., t_n) with f > g and for all i ∈ {1, ..., n}: either s ≻ t_i or
s_j ⪰ t_i for some j.

If s ≻ t_i, then because (f, (s_1, ..., s_m), |t|) > (f, (s_1, ..., s_m), |t_i|) we can
apply the induction hypothesis and conclude that t_i is computable. If s_j ⪰ t_i for
some j, we have that t_i is computable because computability is closed under ≻
and s_j is by assumption computable. So all the t_i are computable.
Suppose that g(t_1, ..., t_n) ≻ u or g(t_1, ..., t_n) →β u. Because we have
(f, (s_1, ..., s_m), |t|) > (g, (t_1, ..., t_n), |u|), the preterm u is computable by
the induction hypothesis. Hence t is computable.

2. t = f(t_1, ..., t_m) with (s_1, ..., s_m) ≻ (t_1, ..., t_m) and for all
i ∈ {1, ..., m}: either s ≻ t_i or s_j ⪰ t_i for some j.

We can show as in the previous case that all t_i are computable. Suppose
that f(t_1, ..., t_m) ≻ u or f(t_1, ..., t_m) →β u. The preterm u is computable
because (f, (s_1, ..., s_m), |t|) > (f, (t_1, ..., t_m), |u|). Hence t is computable.

3. s_i ⪰ t for some s_i.

Because s_i is computable with respect to ≻ ∪ →β by assumption, and
computability is closed under ≻, also t is computable.






F. van Raamsdonk 



4. t = t_0(t_1, ..., t_n) with for all i ∈ {0, ..., n}: either s ≻ t_i or s_j ⪰ t_i
for a j.

As before, we can show that all t_i are computable. By the definition of
computability we have that t is computable.

5. t = f(s_1, ..., s_i', ..., s_m) with s_i →β s_i'.

Suppose that f(..., s_i', ...) ≻ u or f(..., s_i', ...) →β u. Because we have that
(f, (s_1, ..., s_m), |t|) > (f, (s_1, ..., s_i', ..., s_m), |u|), it follows from the
induction hypothesis that u is computable. This yields that t is computable.



Lemma 4. If σ is a computable substitution, then s^σ is computable.

Proof. By induction on the definition of preterms using Lemmas 1 and 3.

A consequence of this lemma is that all preterms (and hence all terms) are
computable. That means that there is no infinite sequence of preterms
s_0 (≻ ∪ →β) s_1 (≻ ∪ →β) s_2 (≻ ∪ →β) ... where every step is either ≻ or →β.
Now the aim is to use this to show that there is no infinite sequence of terms
s_0 → s_1 → s_2 → ... with → the rewrite relation of a HRS satisfying the horpo
criterion.

Lemma 5. Let l → r be a rewrite rule with l ≻ d ↠β r for some preterm d. Let
σ be a substitution. Then there exists a preterm u such that l^σ↓β ≻ u ↠β r^σ↓β.

Proof. The proof proceeds by induction on |l| + |d|. We distinguish cases
according to the definition of ≻. Let l = f(l_1, ..., l_m).

1. d = g(d_1, ..., d_n) with f > g and for all i ∈ {1, ..., n}: either l ≻ d_i or
l_j ⪰ d_i for some j.

We have r = g(r_1, ..., r_n) with r_i = d_i↓β. If l ≻ d_i then by the induction
hypothesis a preterm u_i with l^σ↓β ≻ u_i ↠β r_i^σ↓β exists. If l_j ⪰ d_i then by the
induction hypothesis a preterm u_i with l_j^σ↓β ⪰ u_i ↠β r_i^σ↓β exists. Hence we
have l^σ↓β = f(l_1^σ↓β, ..., l_m^σ↓β) ≻ g(u_1, ..., u_n) ↠β g(r_1^σ↓β, ..., r_n^σ↓β) = r^σ↓β.
So take u = g(u_1, ..., u_n).

2. d = f(d_1, ..., d_m) with (l_1, ..., l_m) ≻ (d_1, ..., d_m) and for all
i ∈ {1, ..., m}: either l ≻ d_i or l_j ⪰ d_i for some j.

We have r = f(r_1, ..., r_m) with r_i = d_i↓β. For both the lexicographic and the
multiset extension of ≻ the existence of suitable preterms u_i follows from the
induction hypothesis. Then l^σ↓β = f(l_1^σ↓β, ..., l_m^σ↓β) ≻ f(u_1, ..., u_m) ↠β
f(r_1^σ↓β, ..., r_m^σ↓β) = r^σ↓β. So we take u = f(u_1, ..., u_m).

3. l_i ⪰ d for some l_i.

By the induction hypothesis there is a preterm u with l_i^σ↓β ⪰ u ↠β r^σ↓β.
Hence l^σ↓β ≻ u ↠β r^σ↓β.

4. d = d_0(d_1, ..., d_n) with for all i ∈ {0, ..., n}: either l ≻ d_i or l_j ⪰ d_i for
some j.

We have r = d_0(d_1, ..., d_n)↓β. By the induction hypothesis there exist u_i's
such that for every i we have either l^σ↓β ≻ u_i ↠β ((d_i↓β)^σ)↓β or l_j^σ↓β ⪰ u_i ↠β
((d_i↓β)^σ)↓β. We have l^σ↓β = f(l_1^σ↓β, ..., l_m^σ↓β) ≻ u_0(u_1, ..., u_n) ↠β r^σ↓β
because r^σ↓β = ((d_0(d_1, ..., d_n))↓β)^σ↓β which equals the β-normal form of
((d_0↓β)^σ)↓β (((d_1↓β)^σ)↓β, ..., ((d_n↓β)^σ)↓β).

5. l = x(l_1, ..., l_m) and d = x(d_1, ..., d_m) with l_i ≻ d_i for some i and l_i ⪰ d_i
for all i.

Because all l_i are η-equivalent to different bound variables, we can only have
l ⪰ d because l = d. Then also l^σ↓β = d^σ↓β.

6. l = x. l_0 and d = x. d_0 with l_0 ≻ d_0.

We have r = x. d_0↓β. By the induction hypothesis a preterm u_0 exists such
that l_0^σ↓β ≻ u_0 ↠β (d_0↓β)^σ↓β. Hence l^σ↓β ≻ x. u_0 ↠β r^σ↓β.

The previous lemma doesn't hold if the left-hand side of a rewrite rule is not a
pattern. Consider for example the would-be rewrite rule f(z(a)) → f(z(b)). We
have f(z(a)) ≻ f(z(b)) using the precedence a > b. However using the substitution
σ = {z ↦ x. a} we do not have f(z(a))^σ↓β = f(a) ≻ f(a) = f(z(b))^σ↓β.

Lemma 6. If s → t in a HRS satisfying the horpo criterion then there exists a
preterm u such that s ≻ u and u ↠β t.

Proof. By induction on the definition of the rewrite relation using Lemma 5.

Theorem 1. A HRS satisfying the horpo criterion is terminating.

Proof. Suppose that we have an infinite rewrite sequence s_0 → s_1 → s_2 → ....

By Lemma 6, s_0 ≻ u_0 ↠β s_1 ≻ u_1 ↠β s_2 .... This contradicts Lemma 4.

A problem in proving termination for HRSs, also discussed in [?], is that a
relation that is both monotonic and closed under β-reduction is reflexive and
hence not well-founded. For instance if b > c, then monotonicity yields (x. a) b >
(x. a) c, and closure under β-reduction yields a > a. There are different ways to
deal with this problem.

In [?], the starting point is horpo for AFSs (here also written as ≻). Then
a subrelation > of ≻ is given that is β-stable. This means that l > r implies
l^σ↓β ≻ r^σ↓β. Because ≻ is monotonic, this yields C[l^σ↓β] ≻ C[r^σ↓β]. Since ≻ is
well-founded, this yields termination of the rewriting relation. The method to
prove termination is hence: show that l > r for every rewrite rule l → r. The
relation > is obtained from ≻ by restricting the clauses dealing with application,
in order to make > β-stable. A consequence of this restriction is that we do not
have map(x. F(x), cons(h, t)) > F(h). So it seems that this ordering cannot be
used to prove termination of the HRS for map.

The approach taken here is different: it is shown that if l ≻ d ↠β r, then
l^σ↓β ≻ u ↠β r^σ↓β. This implies that C[l^σ↓β] ≻ C[u] ↠β C[r^σ↓β]. Because
≻ ∪ →β is well-founded, this yields termination of the rewriting relation.



4.2 Horpo for CRSs 

A well-founded precedence > on the set of function symbols is assumed. 

Definition 3. We have s ≻ t for metaterms s and t if s and t have the same
arity and one of the following clauses holds:

1. s = f(s_1, ..., s_m)
   t = g(t_1, ..., t_n)
   f > g
   for all i ∈ {1, ..., n}: either s ≻ t_i or s_j ⪰ t_i for some j

2. s = f(s_1, ..., s_m)
   t = f(t_1, ..., t_m)
   (s_1, ..., s_m) ≻ (t_1, ..., t_m)
   for all i ∈ {1, ..., m}: either s ≻ t_i or s_j ⪰ t_i for some j

3. s = f(s_1, ..., s_m)
   s_i ⪰ t for some i

4. s = f(s_1, ..., s_m)
   t = t_0(t_1, ..., t_n)
   for all i ∈ {0, ..., n}: either s ≻ t_i or s_j ⪰ t_i for some j

5. s = Z(s_1, ..., s_m)
   t = Z(t_1, ..., t_m)
   for some i: s_i ≻ t_i
   for all i: s_i ⪰ t_i

6. s = s_0(s_1, ..., s_m)
   t = t_0(t_1, ..., t_m)
   for some i: s_i ≻ t_i
   for all i: s_i ⪰ t_i

7. s = [x]s_0
   t = [x]t_0
   s_0 ≻ t_0

The method to prove termination of a CRS is similar to the one for HRSs: a
CRS is terminating if for every rewrite rule l → r there exists a metaterm r'
such that l ≻ r' ↠b r. We call this the horpo criterion (for CRSs).

Example 2.

1. Consider the beta-reduction rule of untyped λ-calculus:

app(abs([x]Z(x)), Z') → Z(Z').

We do not have abs([x]Z(x)) ≻ [x]Z(x) because abs([x]Z(x)) has arity 0 and
[x]Z(x) has arity 1.

2. Consider the following rewrite rule from the CRS for map:

map([x]F(x), cons(H, T)) → cons(F(H), map([x]F(x), T)).

We take r' = cons(([x]F(x))(H), map([x]F(x), T)). Since [x]F(x) ⪰ [x]F(x)
and l ≻ H, we have l ≻ ([x]F(x))(H). Further, map([x]F(x), cons(H, T)) ≻
map([x]F(x), T) and hence using the precedence map > cons we have l ≻ r' by
clause 1.

3. Horpo cannot be used to show termination of the rewrite rule

f(a) → g([x]a)

because there is no subterm in f(a) to deal with [x]a which has arity 1.
Now we can show that the horpo criterion indeed guarantees termination of the
CRS. We use computability with respect to ≻ ∪ →b. The proofs and auxiliary
results are similar to the ones for the HRS case. That is, we show the following:

• If σ is computable, then s^σ is computable. Therefore all metaterms without
metavariables (but possibly with meta-applications) are computable. The
key step is again to show that a metaterm f(s_1, ..., s_m) is computable if
s_1, ..., s_m are computable. This is shown by induction on a triple.

• If s → t then there exists a t' such that s ≻ t' and t' ↠b t.

This is used to prove the following result. 

Theorem 2. A CRS satisfying the horpo criterion is terminating. 



5 The General Scheme 

The general scheme states conditions on the right-hand side of a rewrite rule
that guarantee a termination proof à la Tait and Girard to work. There occur
several incarnations of the general scheme in the literature. The first one is due to
Jouannaud and Okada [?]. In many later works different versions of the general
scheme (depending on the form of the AFS and its typing system) are considered.
For instance termination of the calculus of constructions and algebraic rewriting,
proved using the general scheme, is shown in [?]. In [2] the general scheme is used
to prove termination of IDTSs, which are typed higher-order rewriting systems
with a CRS-like syntax. Also a HRS version is given. Here we present two versions
of the general scheme: one for HRSs and one for CRSs. They are simpler than
the ones in [2]. Another difference is that here we consider CRSs and not IDTSs.
The general schemes used here are close to the one presented for AFSs in [?];
the main difference is in the treatment of β-reduction.

5.1 The General Scheme for HRSs 

We assume a well-founded ordering > on the set of function symbols. 

Definition 4. Let s = f(s_1, ..., s_m) and let X be a set of variables not occurring
free in s. We have t ∈ C(s, X) for a preterm t if one of the following clauses
holds:

1. t = g(t_1, ..., t_n)
   f > g
   t_i ∈ C(s, X) for all i

2. t = f(t_1, ..., t_m)
   (s_1, ..., s_m) ⊃ (t_1, ..., t_m)
   t_i ∈ C(s, X) for all i

3. t = s_i for some i

4. t ⊂ s_i for some i, with t of base type, and all variables of t occur free in s

5. t = x ∈ X

6. t = t_0(t_1, ..., t_n)
   t_i ∈ C(s, X) for all i

7. t = x. t_0
   t_0 ∈ C(s, X ∪ {x}) (x not free in s)

In clause 2 we use ⊃ for the multiset or lexicographic extension of ⊃. We write
C(s) for C(s, ∅). Note that we do not include β-reduction.

The termination method using the general scheme works as follows: a HRS
is terminating if for every rewrite rule l → r there is a preterm r' such that
r' ∈ C(l) and r' ↠β r. We call this the general scheme criterion (for HRSs).

Example 3.

1. It is not possible to show that the beta-reduction rule of untyped λ-calculus

app(abs(x. Z(x)), Z') → Z(Z')

is terminating. The preterm Z(x) is of base type but contains a free variable
(x) that is not free in the left-hand side of the rewrite rule. So clause 4 does
not yield that Z(x) ∈ C(app(abs(x. Z(x)), Z'), X) for any X. Further note that
the variable Z (or its η-expanded form) is not in C(l) because of its type.

2. Using the general scheme we can show termination of the rewrite rule

map(x. F(x), cons(h, t)) → cons(F(h), map(x. F(x), t)).

We take r' = cons((x. F(x))(h), map(x. F(x), t)). We have x. F(x) ∈ C(l) by
clause 3 and h ∈ C(l) by clause 4, and hence (x. F(x)) h ∈ C(l) by clause 6.
Further, t ∈ C(l) by clause 4 and hence map(x. F(x), t) ∈ C(l) by clause 2.
Now we conclude by clause 1 using the precedence map > cons.

3. The following rewrite rule cannot be shown to be terminating using horpo:

f(a) → g(x. a).

It can be shown to be terminating using the general scheme. We have a ∈
C(f(a), {x}) by clause 3 and hence x. a ∈ C(f(a)) by clause 7. Then, using
the precedence f > g, we have g(x. a) ∈ C(f(a)) by clause 1.

4. It is not possible to show termination of the rewrite rule

f(a) → f(b)

using the general scheme. Note that clause 2 cannot be applied. If b is a
constructor, termination follows using the general scheme as in [?]. However,
if b is not a constructor (for instance if also the rule b → c is present) this
version of the general scheme cannot be used anymore either.

Now we show that the general scheme criterion indeed guarantees termination
of the HRS. We use computability with respect to ⇝ ∪ →β. Here the relation
⇝ is defined as the smallest one that is closed under preterm formation, and
that contains l^σ↓β ⇝ r'^σ for every rewrite rule l → r. Also in [?] such a
decomposition of the rewrite step is used to get more grip on the rewrite relation.
The aim is to show that all preterms are computable with respect to ⇝ ∪ →β.

The development is similar to the one in [?] and consists of the following steps:

• If l is the left-hand side of a rewrite rule and t ∈ C(l), then t^σ ∈ C(l^σ↓β).

• If s_1, ..., s_m are computable, then f(s_1, ..., s_m) is computable. We show
that t is computable for every t such that f(s_1, ..., s_m) (⇝ ∪ →β) t. This is
done by induction on a pair. In case the reduction takes place in an s_i we
use the induction hypothesis. In case the reduction takes place at the root,
we use the information that r' ∈ C(l) for every rewrite rule.

• We conclude that all preterms are computable with respect to ⇝ ∪ →β.

This is used to prove the following result. 

Theorem 3. A HRS satisfying the general scheme criterion is terminating.

Proof. A rewrite step s → t can be decomposed as s ⇝ u ↠β t. Termination
follows because all preterms are computable with respect to ⇝ ∪ →β.

5.2 The General Scheme for CRSs 

We assume a well-founded precedence > on the set of function symbols. 
Definition 5. Let s = f(s_1, ..., s_m) and let X be a set of variables not occurring
in s. We have t ∈ C(s, X) if one of the following clauses holds:

1. t = g(t_1, ..., t_n)
   f > g
   t_i ∈ C(s, X) for all i

2. t = f(t_1, ..., t_m)
   (s_1, ..., s_m) ⊃ (t_1, ..., t_m)
   t_i ∈ C(s, X) for all i

3. t = s_i for some i

4. t ⊂ s_i for some i, all variables in t occur in s, and t is of arity 0

5. t = x ∈ X

6. t = t_0(t_1, ..., t_n)
   t_i ∈ C(s, X) for all i

7. t = [x]t_0
   t_0 ∈ C(s, X ∪ {x})

In clause 2 we use ⊃ for the multiset or lexicographic extension of ⊃. Again we
write C(s) for C(s, ∅). Note that we do not include b-reduction.
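The membership test t ∈ C(s, X) is directly recursive on t, so it can be prototyped in a few lines. The Python sketch below follows the seven clauses for CRS metaterms under several assumptions that are not in the paper: metaterms are encoded as tuples ("var", x), ("fun", f, args), ("abs", x, s), ("meta", Z), ("mapp", s_0, args); every function symbol has lexicographic status in clause 2; equality is syntactic rather than modulo renaming of bound variables; the condition 'all variables in t occur in s' is read as a condition on free variables, following the first item of Example 4; and the precedence is a set of pairs (f, g) meaning f > g.

```python
def subterms(s):
    """All submetaterms of s, including s itself."""
    yield s
    tag = s[0]
    if tag == "fun":
        for a in s[2]:
            yield from subterms(a)
    elif tag == "abs":
        yield from subterms(s[2])
    elif tag == "mapp":
        yield from subterms(s[1])
        for a in s[2]:
            yield from subterms(a)

def strict_sub(s, t):
    """True if t is a strict submetaterm of s."""
    return t != s and any(u == t for u in subterms(s))

def free_vars(s):
    tag = s[0]
    if tag == "var":
        return {s[1]}
    if tag == "meta":
        return set()
    if tag == "abs":
        return free_vars(s[2]) - {s[1]}
    if tag == "fun":
        return set().union(*(free_vars(a) for a in s[2]))
    return free_vars(s[1]).union(*(free_vars(a) for a in s[2]))  # mapp

def arity(s, meta_arity):
    tag = s[0]
    if tag == "abs":
        return 1 + arity(s[2], meta_arity)
    if tag == "meta":
        return meta_arity[s[1]]
    return 0

def lex_sub(ss, ts):
    """Lexicographic extension of the strict submetaterm relation."""
    for a, b in zip(ss, ts):
        if a != b:
            return strict_sub(a, b)
    return False

def in_closure(t, s, X, prec, meta_arity):
    """Check t in C(s, X) for s = f(s_1, ..., s_m)."""
    _, f, ss = s
    rec = lambda u, Y=X: in_closure(u, s, Y, prec, meta_arity)
    if t in ss:                                              # clause 3
        return True
    if (arity(t, meta_arity) == 0 and free_vars(t) <= free_vars(s)
            and any(strict_sub(si, t) for si in ss)):        # clause 4
        return True
    tag = t[0]
    if tag == "var":                                         # clause 5
        return t[1] in X
    if tag == "fun":
        g, ts = t[1], t[2]
        if (f, g) in prec:                                   # clause 1
            return all(rec(u) for u in ts)
        if g == f and lex_sub(ss, ts):                       # clause 2
            return all(rec(u) for u in ts)
        return False
    if tag == "mapp":                                        # clause 6
        return rec(t[1]) and all(rec(u) for u in t[2])
    if tag == "abs":                                         # clause 7
        return rec(t[2], X | {t[1]})
    return False  # a bare metavariable not covered by clauses 3 or 4
```

Called with X = set(), this check accepts g([x]a) for s = f(a) under f > g, and it accepts the metaterm r' chosen for the map rule under map > cons; it rejects [x]Z(x) for s = app(abs([x]Z(x)), Z').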

The termination method using the general scheme is as follows: A CRS is
terminating if for every rewrite rule l → r there is a metaterm r' such that
r' ∈ C(l) and r' ↠b r. We call this the general scheme criterion (for CRSs).

Example 4.

1. Consider the beta-reduction rule for untyped λ-calculus:

app(abs([x]Z(x)), Z') → Z(Z').

Clause 4 does not yield that [x]Z(x) ∈ C(app(abs([x]Z(x)), Z')) because
[x]Z(x) has arity 1. Also we do not have Z ∈ C(app(abs([x]Z(x)), Z')).
Further, clause 4 does not yield that Z(x) ∈ C(app(abs([x]Z(x)), Z')) because
the variable x is not free in app(abs([x]Z(x)), Z').

2. Using the general scheme we can show termination of the rewrite rule

map([x]F(x), cons(H, T)) → cons(F(H), map([x]F(x), T)).

We take r' = cons(([x]F(x))(H), map([x]F(x), T)). Then we have r' ∈ C(l).

3. Using the general scheme we can show termination of the rewrite rule

f(a) → g([x]a).

We have a ∈ C(f(a), {x}) and hence [x]a ∈ C(f(a)). Take r' = g([x]a), then
r' ∈ C(f(a)). This rule cannot be shown to be terminating using horpo.

4. Using the general scheme we can show termination of the rewrite rule

f([x]Z(x)) → Z([y]y).

We take r' = ([x]Z(x))([y]y). Then r' ∈ C(f([x]Z(x))) by clause 6 because
[x]Z(x) ∈ C(f([x]Z(x))) by clause 3 and [y]y ∈ C(f([x]Z(x))) by clause 7.

It can be shown that the general scheme criterion indeed guarantees termination
of a CRS as in the previous subsection. We consider computability with respect
to ⇝ ∪ →b where ⇝ is defined as the smallest relation that is closed under
term formation and that satisfies: l^σ↓b ⇝ r'^σ for every rewrite rule l → r.

Theorem 4. A CRS satisfying the general scheme criterion is terminating. 

5.3 Horpo and the General Scheme 

In [?], horpo is made into a stronger ordering by using also the general scheme.
This can be done here as well. Then the last line of the conditions in the clauses
1, 2 and 4 becomes the following: for all i: either s ≻ t_i or s_j ⪰ t_i for some j or
t_i ∈ C(s). This stronger ordering can for instance be used to prove termination
of the rewrite rule f(a) → f(x. b) using the precedence a > b. For this rewrite rule,
neither pure horpo nor the pure general scheme can be used to prove termination.
We leave a further study of this issue to future work.

Acknowledgements. I am grateful to the anonymous referees for their remarks. 
Abstract. Matching is a solving process which is crucial in declarative 
(rule-based) programming languages. In order to apply rules, one has 
to match the left-hand side of a rule with the term to be rewritten. In 
several declarative programming languages, programs involve operators 
that may also satisfy some structural axioms. Therefore, their evaluation 
mechanism must implement powerful matching algorithms working mod- 
ulo equational theories. In this paper, we show the existence of an equa- 
tional theory where matching is decidable (resp. finitary) but matching 
in presence of additional (free) operators is undecidable (resp. infinitary). 
The interest of this result is to easily prove the existence of a frontier 
between matching and matching with free operators. 



1 Introduction 

Solving term equations is a ubiquitous process when performing any kind of 
deduction. For instance, even for performing a basic rewriting step, one has to 
solve an equation between the left-hand side of a rule and a term in which the in- 
stantiation of variables is forbidden. This is called a match-equation. Due to the 
increasing interest of rewriting |7J for specifying and programming in a declara- 
tive way, it is crucial to be able to design efficient and expressive matching al- 
gorithms. Many declarative programming languages use rewriting and its exten- 
sions as operational semantics. These languages enable the programmer to write 
rule-based specifications/programs involving operators which, for the sake of 
expressivity, may satisfy some structural axioms like associativity, associativity- 
commutativity, distributivity, etc... Consequently, the underlying matching al- 
gorithms should be able to work modulo equational theories. Moreover, since 
different equational theories must be implemented in the same framework, we 
naturally face the problem of combining equational theories. 

The combination problem for the union of theories has been thoroughly 
studied during the last fifteen years and this has led to several combination 
algorithms for unification |1, 411411 1I2I1[ and matching |bl9l5j in the union of 
signature-disjoint equational theories. 

In the context of combining decision algorithms or solvers for a union of theo- 
ries, some interesting and famous problems remain open. For instance, the com- 
bination algorithm given in Q for unification needs as input decision algorithms 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 276- T!^ 2001. 

© Springer- Verlag Berlin Heidelberg 2001 



Matching with Free Function Symbols 277 



for unification with additional free function symbols. Combining an equational 
theory with the empty theory generated by some additional free function sym- 
bols is the simplest case of combination we can consider. Currently, it remains 
to be shown whether there exists an equational theory for which unification with 
free constants is decidable (finitary) but unification with free function symbols 
(i.e. general unification) is not. Similarly, an interesting question arises for the 
problems of matching with free constants and matching with free function symbols 
(i.e. general matching): is it possible that one is decidable (finitary) but 
not the other? Here, we give an answer to this question: Yes, definitely. Thus, 
the integration of a matching algorithm into the implementation of a declara- 
tive programming language or a deduction system can be jeopardized, since we 
often need to consider additional free function symbols possibly introduced by 
the programmer or the end-user of the system. We cannot expect a universal 
method for this integration. 

The paper is organized as follows. Section 2 introduces the basic notations 
and concepts of general unification and general matching. Given a regular and 
collapse-free equational theory $E$, we present in Section 3 an equational theory 
$T_E$ which is a conservative extension of $E$. A combined matching algorithm for 
$T_E$ is described in Section 4 and we show that general $T_E$-matching is as difficult 
as $E$-unification. Based on different instances of $T_E$, we exhibit in Section 5 
an equational theory where matching is finitary (resp. decidable) whilst general 
matching becomes infinitary (resp. undecidable). Finally, we conclude in Section 6 
with some final remarks on the difference between unification (with free 
constants) and unification with free function symbols. By lack of space, some 
proofs are omitted in this version but can be found in cni. 

2 General Unification and General Matching 

In this section, we introduce the main definitions and concepts of interest for 
this paper, as well as already known results about these concepts. Our notations 
are compatible with the usual ones j4l5j . 

2.1 Definitions 

A first-order (finite) signature $\Sigma$ is a (finite) set of ranked function symbols. The 
rank of a function symbol $f$ is an integer called arity and denoted by $ar(f)$. A 
function symbol $c$ of arity 0 is called a constant. The denumerable set of variables 
is denoted by $X$. The set of $\Sigma$-terms, denoted by $T(\Sigma, X)$, is the smallest set 
containing $X$ and constants in $\Sigma$, such that $f(t_1, \ldots, t_n)$ is in $T(\Sigma, X)$ whenever 
$f \in \Sigma$, $ar(f) = n$ and $t_i \in T(\Sigma, X)$ for $i = 1, \ldots, n$. The terms $t|_{\omega}$, $t[u]_{\omega}$ and 
$t[\omega \hookleftarrow u]$ denote respectively the subterm of $t$ at the position $\omega$, the term $t$ 
such that $t|_{\omega} = u$, and the term obtained from $t$ by replacing the subterm $t|_{\omega}$ 
by $u$. The symbol of $t$ occurring at the position $\omega$ (resp. the top symbol of $t$) 
is written $t(\omega)$ (resp. $t(\epsilon)$). A $\Sigma$-rooted term is a term whose top symbol is 
in $\Sigma$. The set of variables of a term $t$ is denoted by $Var(t)$. A term is ground 
if it contains no variable. A term is linear if each of its variables occurs just 
once. A $\Sigma$-substitution $\sigma$ is an endomorphism of $T(\Sigma, X)$ denoted by 
$\{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$ if there are only finitely many variables $x_1, \ldots, x_n$ not mapped 
to themselves. Application of a substitution $\sigma$ to a term $t$ (resp. a substitution $\phi$) 
is written $t\sigma$ (resp. $\phi\sigma$). A substitution $\sigma$ is idempotent if $\sigma = \sigma\sigma$. We call domain 
of the substitution $\sigma$ the set of variables $Dom(\sigma) = \{x \mid x \in X \text{ and } x\sigma \neq x\}$. 
Substitutions are denoted by letters $\sigma, \mu, \gamma, \phi, \ldots$ 
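The definitions of substitution application, composition, idempotence and domain can be illustrated by a small sketch. The class and function names below (`Var`, `Fun`, `apply`, `compose`, `is_idempotent`, `domain`) are illustrative assumptions and not notation from the paper; substitutions are represented as finite dictionaries from variable names to terms.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fun:
    symbol: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, Fun]
Subst = Dict[str, Term]


def apply(t: Term, sigma: Subst) -> Term:
    """Compute t sigma: replace every variable of t by its image."""
    if isinstance(t, Var):
        return sigma.get(t.name, t)
    return Fun(t.symbol, tuple(apply(a, sigma) for a in t.args))


def strip_identity(sigma: Subst) -> Subst:
    """Drop bindings x -> x, which do not belong to the domain."""
    return {x: t for x, t in sigma.items() if t != Var(x)}


def compose(sigma: Subst, mu: Subst) -> Subst:
    """Return sigma mu, i.e. x(sigma mu) = (x sigma) mu."""
    out = {x: apply(t, mu) for x, t in sigma.items()}
    for x, t in mu.items():
        out.setdefault(x, t)
    return strip_identity(out)


def is_idempotent(sigma: Subst) -> bool:
    """sigma is idempotent if sigma = sigma sigma."""
    return compose(sigma, sigma) == strip_identity(sigma)


def domain(sigma: Subst) -> set:
    """Dom(sigma): the variables not mapped to themselves."""
    return set(strip_identity(sigma))
```

For instance, under these assumptions $\{x \mapsto f(y)\}$ is idempotent, whereas $\{x \mapsto f(x)\}$ is not, since applying it twice yields $f(f(x))$.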

Given a first-order signature $\Sigma$, and a set $E$ of $\Sigma$-axioms (i.e. pairs of $\Sigma$-terms, 
denoted by $l = r$), the equational theory $=_E$ is the congruence closure of 
$E$ under the law of substitutivity. The equational theory is regular if $Var(l) = 
Var(r)$ for all $l = r$ in $E$, linear if $l, r$ are linear for all $l = r$ in $E$ and collapse-free 
if there is no axiom $l = x$ in $E$, where $l$ is a non-variable $\Sigma$-term and $x$ is 
a variable. By a slight abuse of terminology, $E$ will often be called an 
equational theory. The corresponding term algebra is denoted by $T(\Sigma, X)/=_E$. 
We write $t \leftrightarrow_E t'$ if $t = u[l\sigma]_{\omega}$ and $t' = u[r\sigma]_{\omega}$ for some term $u$, position $\omega$, 
substitution $\sigma$, and equation $l = r$ in $E$ (or $r = l$). We assume that $\Sigma$ is the 
signature of $E$, that is the signature consisting of all function symbols occurring 
in $E$. This means that there are no free function symbols in the signature of 
$E$. Let $C$ (resp. $F$) be a denumerable set of additional constants (resp. function 
symbols) such that $\Sigma \cap C = \emptyset$ (resp. $\Sigma \cap F = \emptyset$). Function symbols in $C$ and 
$F$ are free with respect to $E$. The empty theories generated respectively by $C$ 
and $F$ are denoted by $\mathcal{C}$ and $\mathcal{F}$. In this paper, we are interested in studying and 
comparing the unions of equational theories $E \cup \mathcal{C}$ and $E \cup \mathcal{F}$. 

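A substitution $\phi$ is an $E$-instance on $V \subseteq X$ of a substitution $\sigma$, written 
$\sigma \leq_E^V \phi$ (and read as $\sigma$ is more general modulo $E$ than $\phi$ on $V$), if there exists 
some substitution $\mu$ such that $\forall x \in V,\ x\phi =_E x\sigma\mu$. 

An $E$-unification problem is a conjunction of equations $\varphi = \bigwedge_{k \in K} s_k =^?_E t_k$ 
such that $s_k, t_k$ are $\Sigma$-terms. A substitution $\sigma$ is an $E$-solution of $\varphi$ if 
$\forall k \in K,\ s_k\sigma =_E t_k\sigma$. Given an idempotent substitution $\sigma = \{x_k \mapsto t_k\}_{k \in K}$, $\hat{\sigma}$ is the 
$E$-unification problem $\bigwedge_{k \in K} x_k =^?_E t_k$ called in solved form. 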

Two $E$-unification problems are equivalent if they have the same set of 
$E$-solutions. The set of $E$-solutions $SU_E(\varphi)$ may be schematized in the compact 
form provided by the notion of complete set of $E$-solutions, denoted by $CSU_E(\varphi)$ 
and based on the subsumption ordering $\leq_E^{Var(\varphi)}$ for comparing substitutions. A 
complete set of most general $E$-solutions is a complete set of solutions $\mu CSU_E(\varphi)$ 
whose elements are incomparable with $\leq_E^{Var(\varphi)}$. The set $\mu CSU_E(\varphi)$ need not 
exist in general. Unification problems can be classified according to the existence 
and the cardinality of $\mu CSU_E(\varphi)$. The $E$-unification problem $\varphi$ is (of type) 
nullary if $\mu CSU_E(\varphi)$ does not exist. The $E$-unification problem $\varphi$ is (of type) 
unitary (resp. finitary, infinitary) if $\mu CSU_E(\varphi)$ exists and is at most a singleton 
(resp. finite, infinite). A class of $E$-unification problems is unitary (resp. finitary) 
if each $E$-unification problem in the class is unitary (resp. finitary), and a class 
of $E$-unification problems is infinitary if there exists an $E$-unification problem in 
the class which is infinitary, and if no problem in the class is nullary. A class of 
$E$-unification problems is decidable if there exists a decision algorithm such that for 
each $\varphi$ in this class it returns yes or no whether $CSU_E(\varphi)$ is non-empty or not. 
In this paper, existential variables may be introduced in unification problems, to 
represent new variables, also called fresh variables. These new variables behave 
like other ones, except that their solutions can eventually be omitted by using 
the following transformation rule (where $\vec{x}$ stands for a sequence of variables): 

EQE   $\exists v, \vec{x} : \varphi \wedge v =^?_E t \;\longmapsto\; \exists \vec{x} : \varphi$ 
      if $v \notin Var(\varphi) \cup Var(t)$ 

The set of (free) variables occurring in a unification problem $\varphi$ is denoted by 
$Var(\varphi)$. 

Definition 2.1. An $E$-unification problem with free constants is an $E \cup \mathcal{C}$-unification 
problem. A general $E$-unification problem is an $E \cup \mathcal{F}$-unification problem. 
An $E$-matching problem is an $E \cup \mathcal{C}$-unification problem $\varphi = \bigwedge_{k \in K} s_k =^?_{E \cup \mathcal{C}} t_k$ 
such that $t_k$ is ground for each $k \in K$. A general $E$-matching problem is an 
$E \cup \mathcal{F}$-unification problem $\varphi = \bigwedge_{k \in K} s_k =^?_{E \cup \mathcal{F}} t_k$ such that $t_k$ is ground for each 
$k \in K$. An $E$-word problem is an $E \cup \mathcal{C}$-unification problem $\varphi = \bigwedge_{k \in K} s_k =^?_{E \cup \mathcal{C}} t_k$ 
such that $s_k, t_k$ are ground for each $k \in K$. A general $E$-word problem is an 
$E \cup \mathcal{F}$-unification problem $\varphi = \bigwedge_{k \in K} s_k =^?_{E \cup \mathcal{F}} t_k$ such that $s_k, t_k$ are ground for each 
$k \in K$. 

(General) $E$-matching problems are often written $\varphi = \bigwedge_{k \in K} s_k \leq^?_E t_k$. (General) 
$E$-word problems are often written $\varphi = \bigwedge_{k \in K} s_k \equiv^?_E t_k$. 
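To make the notion of a match-equation concrete, here is a minimal sketch of matching in the simplest possible setting, the empty theory (purely syntactic matching). It is not the $E$-matching of the paper. The representation is an assumption made for illustration: variables are strings, and a non-variable term is a tuple whose first component is the function symbol.

```python
def match(pattern, term, subst=None):
    """Return a substitution s with pattern*s == term, or None if none exists.

    Syntactic matching only: the right-hand side `term` is assumed ground
    and no equational axioms are taken into account.
    """
    subst = dict(subst or {})
    if isinstance(pattern, str):                 # pattern is a variable
        if pattern in subst:                     # non-linear occurrence
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if (isinstance(term, str) or pattern[0] != term[0]
            or len(pattern) != len(term)):
        return None                              # symbol or arity clash
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


a, b = ("a",), ("b",)
print(match(("f", "x", ("g", "y")), ("f", a, ("g", b))))
print(match(("f", "x", "x"), ("f", a, b)))
```

In the empty theory a solvable match-equation has exactly one solution; the interest of the paper lies in theories $E$ where this is no longer the case.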

Bürckert has shown the existence of a theory where unification does not 
remain decidable if we consider additional free constants. On the other hand, for 
regular and collapse-free theories, we know that a unification algorithm with free 
constants can be derived from a unification algorithm by using the combination 
algorithm designed for example in H3 

In the paper, we are interested in the possible difference between matching 
(with additional free constants) and matching with additional free symbols (not 
only constants). 



2.2 The State of the Art 

We are now ready to present the results already known about general unification 
and general matching modulo an equational theory. 

The next proposition is a consequence of the combination algorithm known 
for the union of arbitrary disjoint equational theories. 

Proposition 2.2. PJ For any (disjoint) equational theories $E_1$ and $E_2$, general 
$E_1$-unification and general $E_2$-unification are decidable iff general $E_1 \cup E_2$-unification 
is decidable. 

Now, the problem is to find how to get a general unification algorithm starting 
from a unification algorithm with free constants. 
Proposition 2.3. m If $E$ is a regular equational theory, then an $E$-unification 
algorithm with free constants can be extended to a general $E$-unification algorithm. 

A similar result is known for the matching problem, which is a specific form 
of unification problem with free constants. 

Proposition 2.4. 0 If $E$ is a regular equational theory, then an $E$-matching 
algorithm can be extended to a general $E$-matching algorithm. 

For the matching problem, we may have a better result in the sense that 
decidability of matching is sufficient for some theories, and we do not need to 
know how to compute a complete set of solutions. 

Proposition 2.5. |0| If $E$ is a linear or (a regular and collapse-free) equational 
theory, then general $E$-matching is decidable iff $E$-matching is decidable. 

Here are two examples of regular and collapse-free theories: 

\[
A = \{\, x + (y + z) = (x + y) + z \,\}
\qquad
DA = \left\{ \begin{array}{rcl}
x + (y + z) &=& (x + y) + z \\
x * (y + z) &=& x * y + x * z \\
(y + z) * x &=& y * x + z * x
\end{array} \right.
\]

$A$ is linear but $DA$ is not. $A$-unification is infinitary whilst there exists a finitary 
$A$-matching algorithm. In the same way, $DA$-unification is undecidable whilst 
there exists a finitary $DA$-matching algorithm, and so $DA$-matching is decidable. 
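The finitary nature of $A$-matching can be seen on a restricted fragment. The sketch below assumes that both sides are built only from $+$, variables and constants, so that modulo associativity they can be flattened into words; a variable must then be mapped to a non-empty contiguous block of the ground word. The function name `a_match` and the word representation are assumptions made for this illustration, and this is not a full $A$-matching algorithm.

```python
def a_match(pattern, word, variables, subst=None):
    """Yield every matcher of a flattened pattern against a ground word.

    `pattern` and `word` are tuples of symbol names read modulo
    associativity of +; names listed in `variables` are variables.
    """
    subst = {} if subst is None else subst
    if not pattern:
        if not word:
            yield dict(subst)
        return
    head, rest = pattern[0], pattern[1:]
    if head not in variables:                    # constant: must agree
        if word and word[0] == head:
            yield from a_match(rest, word[1:], variables, subst)
        return
    if head in subst:                            # repeated variable
        block = subst[head]
        if word[:len(block)] == block:
            yield from a_match(rest, word[len(block):], variables, subst)
        return
    for k in range(1, len(word) + 1):            # non-empty prefix of the word
        extended = dict(subst)
        extended[head] = word[:k]
        yield from a_match(rest, word[k:], variables, extended)


# x + y matched against a + b + c has two solutions modulo A.
for solution in a_match(("x", "y"), ("a", "b", "c"), {"x", "y"}):
    print(solution)
```

Since a ground word has finitely many splittings, the enumeration always stops, which is the intuition behind a finitary matching algorithm on this fragment.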

Proposition 2.2, initially proved for unification, no longer holds if unification 
is replaced by matching. 

Proposition 2.6. There exist (disjoint) equational theories $E_1$, $E_2$ such that 
general $E_1$-matching and general $E_2$-matching are decidable (resp. finitary) but 
general $E_1 \cup E_2$-matching is undecidable (resp. infinitary). 

Proof. Corollary of a counter-example given in jHj. Let us consider the decidability 
problem. Let $E_1$ be the theory $DA$ and $E_2$ be the nilpotent theory 
$E_2 = \{x \oplus x = 0\}$. General $E_1$-matching is decidable since $E_1$ is regular 
and collapse-free and $E_1$-matching is decidable 0. On the other hand, general 
$E_2$-matching is decidable since general $E_2$-unification is decidable by using 
well-known (basic) narrowing techniques for $E_2$-unification ^ as well as for 
$E_2$-constant elimination El . But general $E_1 \cup E_2$-matching is undecidable since 
$DA$-unification with free constants is undecidable 1121 and any $DA$-equation $s =^? t$ 
is equivalent to the general $E_1 \cup E_2$-match-equation $f(s) \oplus f(t) \leq^? 0$, where 
$f$ is an additional free function symbol. 

The same kind of proof is possible if we consider the type of unification and 
A instead of DA. □ 

At this stage, a natural question arises: could we prove the existence of a the- 
ory where unification (resp. matching) with free constants is decidable/finitary 
whilst unification (resp. matching) with free symbols becomes undecidable/infinitary? 
The counter-example presented in |Hj allows us to prove Proposition 2.6. 
However, it does not give an answer to the previous question in the case of 
matching. As an answer to this question, we prove in this paper the existence of 
equational theories for which matching is finitary (resp. decidable) and general 
matching is not. These theories are rather simple extensions of A and DA, and 
have similarities with the counter-example presented in 0. 
3 A Non-disjoint Union of Theories 

We present an equational theory T which is of greatest relevance for proving the 
difference between matching and general matching. We study its combination 
with other equational theories E. 

Definition 3.1. Let $T$ be the equational theory defined as follows: 

\[
T = \left\{ \begin{array}{rcl}
h(i(x), i(x)) &=& 0 \\
h(1, x) &=& 0 \\
h(x, 1) &=& 0 \\
i(h(x, y)) &=& 1 \\
i(i(x)) &=& 1 \\
i(0) &=& 1 \\
i(1) &=& 1
\end{array} \right.
\]

Given an equational theory $E$ such that $\Sigma_E \cap \Sigma_T = \emptyset$, $T_E$ denotes the 
following equational theory: 

\[
T_E = T \cup E \cup \bigcup_{f \in \Sigma_E} \{\, i(f(x_1, \ldots, x_{ar(f)})) = 1 \,\}
\]

where $x_1, \ldots, x_{ar(f)}$ are distinct variables in $X$. In the following, $\Sigma^+$ denotes the 
union of the set of free constants $C$ and the signature of $T_E$, whereas $\Sigma$ denotes 
the signature of $E$. The set of $i$-terms is defined by: 

\[
IT(\Sigma, X) = \{\, i(t) \mid t \in T(\Sigma, X) \,\}
\]

If $E$ is represented by a confluent and terminating TRS $R$, then $T_R$ denotes 
the TRS obtained from $T_E$ by replacing $E$ by $R$ and by orienting the axioms given 
above from left to right. 
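The oriented axioms of $T$ can be exercised with a small normalizer. This is only a sketch under explicit assumptions: terms are nested tuples with the function symbol first, variables are strings, the rules of $R$ for $E$ itself are left out, and the symbols of $\Sigma_E$ are passed in the hypothetical parameter `sigma_e` so that the rule $i(f(x_1, \ldots, x_{ar(f)})) \to 1$ can be applied.

```python
ZERO, ONE = ("0",), ("1",)


def normalize(t, sigma_e=frozenset()):
    """Innermost normalization w.r.t. the oriented axioms of T.

    The rewrite rules of R (for E itself) are NOT included; only the
    rules i(f(...)) -> 1 for the symbols f listed in sigma_e are.
    """
    if isinstance(t, str):                       # variable
        return t
    head = t[0]
    args = tuple(normalize(a, sigma_e) for a in t[1:])
    if head == "h" and len(args) == 2:
        s, u = args
        if s == ONE or u == ONE:                 # h(1, x) -> 0, h(x, 1) -> 0
            return ZERO
        if not isinstance(s, str) and s[0] == "i" and s == u:
            return ZERO                          # h(i(x), i(x)) -> 0
    if head == "i" and len(args) == 1:
        (a,) = args
        if not isinstance(a, str) and (
                a[0] in {"h", "i", "0", "1"} or a[0] in sigma_e):
            return ONE                           # i(h(..)), i(i(..)), i(0), i(1), i(f(..)) -> 1
    return (head,) + args


print(normalize(("h", ("i", "x"), ("i", "x"))))          # ('0',)
print(normalize(("i", ("+", "x", "y")), frozenset({"+"})))  # ('1',)
```

Note that $i(c)$ for a free constant $c$ and $i(x)$ for a variable $x$ are left untouched, which is what makes the terms of $IT(C, X)$ usable as "names" that $h$ can compare.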

Assumption 1 E is a regular and collapse-free equational theory represented 
by a confluent and terminating TRS R. 

Proposition 3.2. $T_R$ is a confluent and terminating TRS. 

We introduce the basic notations and technicalities to state some useful properties 
about the theory $T_E$ seen as a union of theories. More precisely, we want 
to show how any $T_E$-equality between heterogeneous $\Sigma^+$-terms can be reduced 
to an $E$-equality between pure $\Sigma$-terms. To this aim, the notion of variable 
abstraction allows us to purify heterogeneous terms by replacing alien subterms 
with abstraction variables. 

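Definition 3.3. The set of abstraction variables is defined as follows: 

\[
AV = \{\, \chi_{[t\downarrow_{T_R}]} \mid t \in T(\Sigma^+, X),\ t(\epsilon) \in \Sigma^+ \setminus \Sigma \,\}
\]

where $AV$ is a set of new variables disjoint from $X$. Let $X^+$ be $X \cup AV$. The 
mapping $\pi$, that associates to each term $t$ the variable $\chi_{[t\downarrow_{T_R}]} \in AV$, uniquely 
extends to a $\Sigma$-homomorphism from $T(\Sigma^+, X)$ to $T(\Sigma, X^+)$ defined as follows: 

• $f(t_1, \ldots, t_n)^{\pi} = f(t_1^{\pi}, \ldots, t_n^{\pi})$ if $f \in \Sigma$, 

• $f(t_1, \ldots, t_n)^{\pi} = \chi_{[f(t_1, \ldots, t_n)\downarrow_{T_R}]}$ if $f \in \Sigma^+ \setminus \Sigma$, 

• $x^{\pi} = x$ if $x \in X$. 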

The image of a term $t$ by this $\Sigma$-homomorphism is called the $\pi$-abstraction of 
$t$. The set of abstraction variables in $t^{\pi}$ is $AV(t^{\pi}) = AV \cap Var(t^{\pi})$. Similarly, the 
$\pi$-abstraction $\varphi^{\pi}$ of an equational formula $\varphi$ is defined by replacing the terms 
occurring in $\varphi$ by their $\pi$-abstractions. The set of abstraction variables occurring 
in $\varphi^{\pi}$ is $AV(\varphi^{\pi}) = AV \cap Var(\varphi^{\pi})$. The $\pi$-instantiation of variables in $AV(\varphi^{\pi})$ 
is the unique substitution $\pi^{-1}$ such that $\pi^{-1} = \{\chi_{[t\downarrow_{T_R}]} \mapsto t\downarrow_{T_R}\}_{\chi_{[t\downarrow_{T_R}]} \in AV(\varphi^{\pi})}$. 
An alien subterm of $t$ is a $\Sigma^+ \setminus \Sigma$-rooted term such that all superterms are 
$\Sigma$-rooted terms. $AlienPos(t)$ denotes the set of positions of alien subterms in $t$. 
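Alien positions and variable abstraction can be sketched as follows. The representation is an assumption for illustration: terms are nested tuples with the function symbol first, variables are strings, positions are tuples of 1-based argument indices, and `sigma` is the set of symbols of $\Sigma$. The sketch also assumes that alien subterms are already in normal form, so that an abstraction variable can simply be named after the subterm it replaces.

```python
def alien_positions(t, sigma, pos=()):
    """Positions of the alien subterms of t.

    A subterm is alien when its top symbol lies outside `sigma` and every
    proper superterm is sigma-rooted; the search therefore stops at the
    first non-sigma symbol met on each branch.
    """
    if isinstance(t, str):                       # variables are never alien
        return []
    if t[0] not in sigma:
        return [pos]
    found = []
    for k, arg in enumerate(t[1:], start=1):
        found += alien_positions(arg, sigma, pos + (k,))
    return found


def abstract(t, sigma):
    """Replace each alien subterm by an abstraction variable named after it.

    Assumes alien subterms are already normalized, so the name can be
    derived from the subterm itself.
    """
    if isinstance(t, str):
        return t
    if t[0] not in sigma:
        return f"X[{t!r}]"
    return (t[0],) + tuple(abstract(a, sigma) for a in t[1:])


term = ("+", ("h", "x", "y"), ("+", "z", ("i", "x")))
print(alien_positions(term, {"+"}))              # [(1,), (2, 2)]
print(abstract(term, {"+"}))
```

Equal alien subterms receive the same abstraction variable, so the abstracted term is a pure term over $\Sigma$ and the enlarged variable set.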

We are now ready to state the result relating $T_E$-equational proofs and 
$E$-equational proofs. 

Proposition 3.4. For any $t, u \in T(\Sigma^+, X)$, we have 

\[
t =_{T_E} u \iff t^{\pi} =_E u^{\pi} .
\]

Proof. The ($\Leftarrow$) direction is obvious since for any term $t$, we have $t =_{T_E} t^{\pi}\pi^{-1}$. 
Consider now the ($\Rightarrow$) direction. 

— If $t$ and $u$ are $\Sigma$-rooted terms, $t =_{T_E} u$ implies that $t\downarrow_{T_R} = u\downarrow_{T_R}$ and so, by 
Lemma A.1 (Section A), we have 

\[
t^{\pi} \to^*_R (t\downarrow_{T_R})^{\pi} = (u\downarrow_{T_R})^{\pi} \;{}^*_R\!\leftarrow u^{\pi}
\]

— If $t$ and $u$ are $(\Sigma^+ \setminus \Sigma)$-rooted terms, then $t =_{T_E} u$ is equivalent to $t^{\pi} = u^{\pi}$ 
by definition of the $\pi$-abstraction. 

— If $t$ is $\Sigma$-rooted, then its normal form is a $\Sigma$-rooted term by Lemma A.1. 
If $u$ is $(\Sigma^+ \setminus \Sigma)$-rooted, then its normal form is a $(\Sigma^+ \setminus \Sigma)$-rooted term by 
Lemma A.2 and Lemma A.3. If $u$ is a variable, then $u$ is in normal form. 
Therefore, $t$ and $u$ cannot be $T_E$-equal if $u$ is $(\Sigma^+ \setminus \Sigma)$-rooted or a variable. 
Moreover, in both cases, $u^{\pi}$ is a variable which cannot be $E$-equal to the 
$\Sigma$-term $t^{\pi}$, otherwise it contradicts the fact that $E$ is collapse-free. 

Note that $t^{\pi}$ and $u^{\pi}$ are simply identical if $t$ and $u$ are $(\Sigma^+ \setminus \Sigma)$-rooted terms. 
This result has two main consequences: 

— First, an $E$-matching algorithm can be reused without loss of completeness 
for solving $T_E$-matching problems involving only $\Sigma$-pure terms, and more 
generally left-pure $E$-matching problems (see Section 4). 

— Second, each alien subterm occurring in $t$ is $T_E$-equal to an alien subterm 
of $u$ since $E$ is a regular and collapse-free equational theory. This explains 
why we can reuse a purification rule (Purif, see Section 4) borrowed from a 
combined matching algorithm in a disjoint union of regular and collapse-free 
theories |E]. 
4 A Combined Matching Algorithm 

To solve a $T_E$-matching problem, the idea is to proceed in a modular way by 
splitting the input problem into conjunctions of two separate subproblems. The 
first subproblem consists of match-equations with non-ground $\Sigma$-terms, and it 
will be solved using an $E$-matching algorithm. The second subproblem, containing 
the rest of the match-equations, will be handled by a rule-based matching algorithm 
which takes into account the axioms of $T$. Let us first detail the two classes of 
subproblems we are interested in. 

Definition 4.1. A $T_E$-matching problem $\varphi$ is a left-pure $E$-matching problem if 
left-hand sides of match-equations in $\varphi$ are $\Sigma$-terms and for each match-equation, 
the right-hand side is $\Sigma$-rooted if the left-hand side is $\Sigma$-rooted. A $T_E$-matching 
problem $\gamma$ is a 01-matching problem if right-hand sides of match-equations in $\gamma$ 
are in $\{0, 1\}$. 

Assumption 2 Left-hand sides and right-hand sides of $T_E$-matching problems 
are normalized with respect to $T_E$. 

A rule-based algorithm for reducing a $T_E$-matching problem into conjunctions 
of a left-pure $E$-matching problem and a 01-matching problem, called LeftPurif, 
is defined by the repeated application of the rules given below, where Purif must 
be applied in all possible ways (don't know nondeterminism), whilst Dec and 
Clash are applied in one possible way (don't care nondeterminism). 



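Dec     $\exists \vec{x} : \varphi \wedge f(p_1, \ldots, p_n) \leq^? f(t_1, \ldots, t_n) \;\longmapsto\; \exists \vec{x} : \varphi \wedge p_1 \leq^? t_1 \wedge \cdots \wedge p_n \leq^? t_n$ 
        if $f \in \{h, i\}$ 

Clash   $\exists \vec{x} : \varphi \wedge p \leq^? t \;\longmapsto\; Fail$ 
        if ($t(\epsilon) \in \{h, i\} \cup C$, $p \notin X$, $p(\epsilon) \neq t(\epsilon)$) or ($t(\epsilon) \in \Sigma$, $p \notin X$, $p(\epsilon) \notin \Sigma$) 

EqualC  $\exists \vec{x} : \varphi \wedge c \leq^? c \;\longmapsto\; \exists \vec{x} : \varphi$ 
        if $c \in C$ 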


Purif 3x : ip A p <' t 

H-» 

V {(,,, W')\WG AlienPos{p) ^ ^ ^ v] <U A A v fp 

uj' G AlienPos{t)} 
if t{e),p{e) G E , AlienPos{p) / 0 
where v is fresh. 



Fig. 1. LeftPurif Rules 
The correctness of LeftPurif is stated by the following proposition: 

Proposition 4.2. LeftPurif applied on a Tg-matching problem ip terminates 
and leads to a finite set of normal forms, such that the normal forms different 
from Fail are all the elements of 



iei 



such that 

— is a left-pure if-matchingproblem, for i £ I. 

— is a 01-matching probleirQ, for i £ I. 

— The union 

U U 

creCSUEiVi) <f>^CSUTE{li<^''^v^) 

is a CSUteW)- 

Remark 4.3. A 01-matching problem remains a 01-matching problem after in- 
stantiation of its variables. Hence, the matching problem in Proposi- 

tion lO is a 01-matching problem. 

Therefore, by using an $E$-matching algorithm, we are able to reduce arbitrary 
$T_E$-matching problems into $T_E$-matching problems where right-hand sides 
are in $\{0, 1\}$. These matching problems will be solved by repeatedly applying 
the transformation rules in Figure 2 and in Figure 3. 

Proposition 4.4. Match01 applied on a 01-matching problem $\gamma$ terminates 
and leads to a finite set of solved forms, such that the solved forms different 
from Fail are all the elements of a $CSU_{T_E}(\gamma)$. 

Proof. The proof is divided in three parts: 

— Transformation rules in Match01 are correct and complete. 

Rules Equal, Diff, RepM are known to be correct and complete in any 
equational theory. Correctness of rules Match-* can be directly proved by 
applying axioms of $T_E$. The completeness of the remaining rules can be shown 
using Lemma A.6 and Lemma A.5 given in Section A. 

— Solved forms are indeed normal forms with respect to the transformation 
rules in Match01. 

In j 1 1 )) . we prove that a transformation rule can be applied on any 
01-matching problem (with $T_E$-normalized left-hand sides) which is not in 
solved form. 

— The repeated application of rules in Match01 always terminates. 

We can use as complexity measure a lexicographic combination of three 
noetherian orderings: 



^ excluding left-pure $E$-match-equations $x \leq^? t$ of $\varphi$, where $x \in X$, $t \in \{0, 1\}$. 
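Equal   $\exists \vec{x} : \varphi \wedge p \leq^? t \;\longmapsto\; \exists \vec{x} : \varphi$ 
        if $p \in T(\Sigma^+)$ and $p = t$ 

Diff    $\exists \vec{x} : \varphi \wedge p \leq^? t \;\longmapsto\; Fail$ 
        if $p \in T(\Sigma^+)$ and $p \neq t$ 

RepM    $\exists \vec{x} : \varphi \wedge x \leq^? t \;\longmapsto\; \exists \vec{x} : \varphi\{x \mapsto t\} \wedge x \leq^? t$ 
        if $x \in \mathcal{V}(\varphi)$ 

ClashE  $\exists \vec{x} : \varphi \wedge p \leq^? t \;\longmapsto\; Fail$ 
        if $p(\epsilon) \in \Sigma$ and $t \in \{0, 1\}$ 

Fail0   $\exists \vec{x} : \varphi \wedge i(u) \leq^? 0 \;\longmapsto\; Fail$ 

Fail1   $\exists \vec{x} : \varphi \wedge h(s, t) \leq^? 1 \;\longmapsto\; Fail$ 

Fail-h  $\exists \vec{x} : \varphi \wedge h(s, t) \leq^? 0 \;\longmapsto\; Fail$ 
        if $s \notin X \cup IT(C, X)$, $t \notin X \cup IT(C, X)$ 

Fig. 2. Match01 Rules 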



1. The first component is $NME(\gamma)$, the number of match-equations in $\gamma$. 

2. The second component is $\neg SV(\gamma) = \mathcal{V}(\gamma) \setminus SV(\gamma)$, where $SV(\gamma)$ is the 
set of solved variables in $\gamma$. 

3. The third component is $MSL(\gamma)$, the multiset of sizes of left-hand sides 
of match-equations in $\gamma$. 



Rules                    NME(γ)    ¬SV(γ)    MSL(γ) 
Equal                    ↓ 
Diff                     ↓ 
RepM                     = 
ClashE                   ↓ 
Fail0                    ↓ 
Fail1                    ↓ 
Fail-h                   ↓ 
Match-hxy(1)             ↓ 
Match-hxy(2-3)           =         ↓=        ↓ 
Match-hxi/hix(1) 
Match-hxi/hix(2-3)       =         ↓=        ↓ 
Match-hti/hit            =         =         ↓ 
Match-htx/hxt            =         ↓=        ↓ 
Match-hivic/hiciv        =         ↓=        ↓ 
Match-hiviv(1) 
Match-hiviv(2-3)         =         =         ↓ 
Match-i                  ↓ 
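Match-hxy    $\exists \vec{x} : \varphi \wedge h(x, y) \leq^? 0 \;\longmapsto$ 
                  $\exists z, \vec{x} : (\varphi\{x \mapsto i(z), y \mapsto i(z)\} \wedge x =^? i(z) \wedge y =^? i(z))$ 
             $\vee\ \exists \vec{x} : \varphi \wedge x \leq^? 1$ 
             $\vee\ \exists \vec{x} : \varphi \wedge y \leq^? 1$ 
             if $x, y, z \in X$, where $z$ is fresh. 

Match-hxi    $\exists \vec{x} : \varphi \wedge h(x, i(vc)) \leq^? 0 \;\longmapsto$ 
                  $\exists \vec{x} : \varphi\{x \mapsto i(vc)\} \wedge x =^? i(vc)$ 
             $\vee\ \exists \vec{x} : \varphi \wedge x \leq^? 1$ 
             $\vee\ \exists \vec{x} : \varphi \wedge i(vc) \leq^? 1$ 
             if $x \in X$, $vc \in X \cup C$ 

Match-hix    $\exists \vec{x} : \varphi \wedge h(i(vc), x) \leq^? 0 \;\longmapsto$ 
                  $\exists \vec{x} : \varphi\{x \mapsto i(vc)\} \wedge x =^? i(vc)$ 
             $\vee\ \exists \vec{x} : \varphi \wedge x \leq^? 1$ 
             $\vee\ \exists \vec{x} : \varphi \wedge i(vc) \leq^? 1$ 
             if $x \in X$, $vc \in X \cup C$ 

Match-hti    $\exists \vec{x} : \varphi \wedge h(t, i(vc)) \leq^? 0 \;\longmapsto\; \exists \vec{x} : \varphi \wedge i(vc) \leq^? 1$ 
             if $t \notin X \cup IT(C, X)$ 

Match-hit    $\exists \vec{x} : \varphi \wedge h(i(vc), t) \leq^? 0 \;\longmapsto\; \exists \vec{x} : \varphi \wedge i(vc) \leq^? 1$ 
             if $t \notin X \cup IT(C, X)$ 

Match-htx    $\exists \vec{x} : \varphi \wedge h(t, x) \leq^? 0 \;\longmapsto\; \exists \vec{x} : \varphi \wedge x \leq^? 1$ 
             if $x \in X$, $t \notin X \cup IT(C, X)$ 

Match-hxt    $\exists \vec{x} : \varphi \wedge h(x, t) \leq^? 0 \;\longmapsto\; \exists \vec{x} : \varphi \wedge x \leq^? 1$ 
             if $x \in X$, $t \notin X \cup IT(C, X)$ 

Match-hivic  $\exists \vec{x} : \varphi \wedge h(i(v), i(c)) \leq^? 0 \;\longmapsto$ 
                  $\exists \vec{x} : \varphi \wedge v \leq^? c$ 
             $\vee\ \exists \vec{x} : \varphi \wedge i(v) \leq^? 1$ 
             if $v \in X$, $c \in C$ 

Match-hiciv  $\exists \vec{x} : \varphi \wedge h(i(c), i(v)) \leq^? 0 \;\longmapsto$ 
                  $\exists \vec{x} : \varphi \wedge v \leq^? c$ 
             $\vee\ \exists \vec{x} : \varphi \wedge i(v) \leq^? 1$ 
             if $v \in X$, $c \in C$ 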



3x 

Match-hiviv 3x : p A h(i(v),i(v')) <’ 0 h-» v 3a; 

V 3a; 

if v,v' £ X 



p{v' v} Av v' 
p A i{v) <’ 1 
p A i{v') 1 



Match- i 



3i : A i{v) 1 ^ (35, * : ^ ^ 

if V G A" 

where z = zi, . . . , Zar(f) are distinct fresh variables. 



Fig. 3. Match01 Rules (continued) 
Proposition 4.5. Let $E$ be a regular and collapse-free equational theory such 
that $E$ can be represented by a confluent and terminating TRS. Given an 
$E$-matching algorithm, it is possible to construct a $T_E$-matching algorithm. 

Proposition 4.6. General $T_E$-matching is undecidable (resp. infinitary) if 
$E$-unification is undecidable (resp. infinitary). 

Proof. Any $T_E$-equation $s =^? t$ is equivalent to the general $T_E$-match-equation 
$h(i(g(s)), i(g(t))) \leq^? 0$, where $g$ is an additional free unary function symbol. 
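The easy direction of this encoding can be spelled out from the axioms of Definition 3.1. The following is only a sketch of that direction, for a substitution $\sigma$ assumed to satisfy $s\sigma =_{T_E} t\sigma$; the converse relies on the lemmas of the appendix and is not reproduced here.

```latex
\begin{align*}
h(i(g(s\sigma)),\, i(g(t\sigma)))
  &=_{T_E} h(i(g(s\sigma)),\, i(g(s\sigma))) && \text{since } s\sigma =_{T_E} t\sigma \\
  &=_{T_E} 0 && \text{by the axiom } h(i(x), i(x)) = 0 .
\end{align*}
```

The free symbol $g$ appears to be what keeps the encoding faithful. For a $\Sigma$-rooted term $s$, the axiom $i(f(x_1, \ldots, x_{ar(f)})) = 1$ gives $i(s) =_{T_E} 1$, and then $h(1, x) = 0$ makes $h(i(s), i(t))$ equal to $0$ whatever $t$ is. No axiom of $T_E$ has the form $i(g(x)) = 1$, so wrapping both sides in $g$ blocks this collapse.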



5 Matching versus General Matching 



5.1 A Theory with Finitary Matching and Inflnitary General 
Matching 



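Consider the equational theory $T_A$: 

\[
T_A = \left\{ \begin{array}{rcl}
h(i(x), i(x)) &=& 0 \\
h(1, x) &=& 0 \\
h(x, 1) &=& 0 \\
i(h(x, y)) &=& 1 \\
i(i(x)) &=& 1 \\
i(0) &=& 1 \\
i(1) &=& 1 \\
i(x + y) &=& 1 \\
x + (y + z) &=& (x + y) + z
\end{array} \right.
\]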

Theorem 5.1. There exists an equational theory for which matching is finitary 
and general matching is infinitary. 

Proof. An $A$-matching algorithm is known. By Proposition 4.5 it follows that 
$T_A$-matching is finitary. Moreover, $A$-unification is infinitary. According to 
Proposition 4.6, general $T_A$-matching is infinitary. 



5.2 A Theory with Decidable Matching and Undecidable General 
Matching 



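Consider the equational theory $T_{DA}$: 

\[
T_{DA} = \left\{ \begin{array}{rcl}
h(i(x), i(x)) &=& 0 \\
h(1, x) &=& 0 \\
h(x, 1) &=& 0 \\
i(h(x, y)) &=& 1 \\
i(i(x)) &=& 1 \\
i(0) &=& 1 \\
i(1) &=& 1 \\
i(x + y) &=& 1 \\
i(x * y) &=& 1 \\
x + (y + z) &=& (x + y) + z \\
x * (y + z) &=& x * y + x * z \\
(y + z) * x &=& y * x + z * x
\end{array} \right.
\]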
Theorem 5.2. There exists an equational theory for which matching is decidable 
and general matching is undecidable. 

Proof. A $DA$-matching algorithm is known. By Proposition 4.5 it follows that 
$T_{DA}$-matching is decidable. Moreover, $DA$-unification is undecidable. According 
to Proposition 4.6, general $T_{DA}$-matching is undecidable. 



6 Conclusion 

One major problem in the context of combining decision/solving algorithms for 
the union of theories is to show that a decision/solving algorithm cannot 
always be extended in order to allow additional free function symbols. In this 
direction, we point out the existence of a gap between matching and matching 
with additional free function symbols. This new result is obtained thanks to an 
equational theory $T_E$ defined as a non-disjoint union of theories involving a given 
equational theory $T$ with non-linear and non-regular axioms plus an arbitrary 
regular and collapse-free equational theory $E$, with axioms like $A$ (Associativity) 
and $D$ (Distributivity). On the one hand, we present a combined matching 
algorithm for $T_E$. This matching algorithm requires an $E$-matching algorithm. 
On the other hand, we show that general $T_E$-matching is undecidable (resp. 
infinitary) if $E$-unification is undecidable (resp. infinitary). Then, to end the proof 
of our main result, it is sufficient to remark that there exist regular and 
collapse-free theories $E$ having an $E$-matching algorithm but for which $E$-unification is 
either infinitary or undecidable. Our result suggests that the same separation 
between unification without and with free function symbols should hold. In fact, 
for the order-sorted case where we introduce a sort sg for Tg and a subsort sg 
for E, a separation can be established. In this order-sorted framework, any uni- 
fication problem of sort sg can be reduced to an equivalent matching problem. 
Therefore, the result stated for matching can be applied to prove the existence 
of an order-sorted equational theory for which solving unification problems of a 
given sort becomes undecidable when free function symbols are considered m- 
A Equational Properties of $T_E$ 

In this appendix, one can find the lemmas needed to state the results expressed 
by Proposition 3.4 and Proposition 4.4. 

Lemma A.1. For any $\Sigma$-rooted term $t$, we have 

\[
t^{\pi} \to^*_R (t\downarrow_{T_R})^{\pi}
\]

where $t\downarrow_{T_R}$ is a $\Sigma$-rooted term. 

Proof. Assume there exists $t'$ such that $t \to_R t'$. Since $E$ is collapse-free, $t'$ is 
necessarily a $\Sigma$-rooted term. 

If the rewrite rule from $R$ is applied at a position greater than (or equal 
to) a position in $AlienPos(t)$, then $t^{\pi} = t'^{\pi}$ since a one-step rewriting of a 
$(\Sigma^+ \setminus \Sigma)$-term with a rewrite rule of $R$ does not change its top-symbol. 

Otherwise, the same rewrite rule can be applied on $t^{\pi}$ and we get $t^{\pi} \to_R t'^{\pi}$. 

Then, the result directly follows from an induction on the length of the rewrite 
derivation. 

Lemma A.2. For any terms $s, t \in T(\Sigma^+, X)$, 

\[
h(s, t)\downarrow_{T_R} = h(s\downarrow_{T_R}, t\downarrow_{T_R}) \;\vee\; h(s, t)\downarrow_{T_R} = 0
\]

Proof. A rewrite step is always applied at a position different from $\epsilon$, except if 
it is the last step. This last step leads to the normal form 0 if the top-symbol is 
$h$ and a rule is applicable at the top position. 



290 C. Ringeissen 



Lemma A.3. For any term $t \in T(\Sigma^+, X)$, 

\[
i(t)\downarrow_{T_R} = i(t\downarrow_{T_R}) \;\vee\; i(t)\downarrow_{T_R} = 1
\]

Proof. Similar to the proof of Lemma A.2. The normal form is 1 if the top-symbol 
is $i$ and a rule is applicable at the top position. 

Lemma A.4. For any term $t \in IT(C, X)$, $t$ is in normal form wrt. $T_R$, and 
there is no other term $T_E$-equal to $t$. 

Proof. Let $t \in IT(C, X)$. It is easy to check that no rule in $T_R$ can be applied 
on $t$. Furthermore, $t$ cannot be equal to another different term as shown next: 

— First, it cannot be equal to 0 and 1 which are in normal form. 

— It cannot be equal to a term with $h$ as top-symbol, by Lemma A.2. 

— It cannot be equal to another term in $IT(C, X)$, since all these terms are in 
normal form. 

— It cannot be equal to a term in $IT(\Sigma^+, X) \setminus IT(C, X)$, since the normal form 
of the latter is necessarily 1. 

— It cannot be equal to a $\Sigma$-rooted term, by Proposition 3.4. Otherwise, it 
contradicts the fact that $E$ is collapse-free. 



Lemma A. 5. For any term u G T(E~^, X)\{1}, 

u=Tr l^uG IT{E+, X)\ir{C, X) 

Proof. By applying Lemma El Lemma El and the fact that normal forms 
of if-rooted terms are if-rooted, 1 can only be the normal form of a term in 
IT{E~^ , X) (or the normal form of itself). Then, we can remark that terms in 
IT{E'^ , X)\IT{C, X) can be reduced to 1. 

Lemma A. 6 . For any term u G T(2f+, A)\{0}, 



u =Te 0 3s, t : u = h{s, t)A 



s,t & ^T{C, X) A s = t 
V (-(s, t G IT(C, X)) A (s =T« 1 V t =Te 1)) 



Proof. By applying Lemma I a. 21 Lemma El and the fact that normal forms of 
27-rooted terms are H-rooted, 0 can only be the normal form of a term u with 
h as top-symbol, or the normal form of 0. Since u yf 0, there exist s, t such that 
u = h{s, t), and there is necessarily a rewrite proof of the form 



u = h{s,t) — >Tr Hs ItrT 4-Th) -Atr 0 
Let us consider the last rewrite step: 

— If the applied rule is h{i{x),i{x)) -A 0, then s ItrT Itr& IT{C,X) and 
s 4-Th= t Itr- By Lemma [a. 41 we have s = s I.Tr= t 4-Tr= t. 

— If the applied rule is h{l, i{x)) -A 0, then s irR= 1- By Ijemma fA.5l we have 

sGiTiE+,x)\ir{c,x). 

— In a symmetric way, if the applied rule is h{i{x), 1) -A 0, then t 4-Tk= 1- By 
Lemma El we have t G IT{E+ , X)\IT{C , X). 
Abstract. We propose a new method for deriving focused ordered
resolution calculi, exemplified by chaining calculi for transitive relations. 
Previously, inference rules were postulated and a posteriori verified in 
semantic completeness proofs. We derive them from the theory axioms. 
Completeness of our calculi then follows from correctness of this synthesis.
Our method clearly separates deductive and procedural aspects:
relating ordered chaining to Knuth-Bendix completion for transitive 
relations provides the semantic background that drives the synthesis 
towards its goal. This yields a more restrictive and transparent chaining 
calculus. The method also supports the development of approximate 
focused calculi and a modular approach to theory hierarchies. 



1 Introduction 

The integration of theory-specific knowledge and the systematic development 
of focused calculi are indispensable for applying logic in computer science. Fo- 
cused first-order reasoning about equations, transitive relations and orderings 
has been successfully achieved, for instance, with ordered chaining and superpo- 
sition calculi via integration of term rewriting techniques. In presence of more 
mathematical structure, however, the standard techniques do not sufficiently 
support a systematic development: Inference rules must be postulated ad hoc 
and a posteriori justified in rather involved semantic completeness proofs. 

We therefore propose an alternative solution: a method for systematically 
developing focused ordered resolution calculi by deriving their inference rules. 
We exemplify the method by a simple case, deriving ordered chaining calculi 
for transitive relations from the ordered resolution calculus with the transitivity 
axiom. More complex examples, including derived chaining and tableau calculi 
for various lattices, can be found in El- 

Chaining calculi !iil.ai.U.KJZ| are instances of theory resolution. The
transitivity axiom $x < y < z \to x < z$ is replaced by a chaining rule

\[
\frac{\Gamma \to \Delta, x < y \qquad \Gamma' \to \Delta', y < z}{\Gamma, \Gamma' \to \Delta, \Delta', x < z}
\]



that is a derived rule 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 231- nTO 2001. 
© Springer- Verlag Berlin Heidelberg 2001 
\[
\frac{\dfrac{\Gamma \to \Delta, x < y \qquad x < y < z \to x < z}{\Gamma, y < z \to \Delta, x < z} \qquad \Gamma' \to \Delta', y < z}{\Gamma, \Gamma' \to \Delta, \Delta', x < z}
\]

of the resolution calculus internalizing the transitivity axiom. Chaining prunes 
the search space, in particular in its ordered variant |2ld|, where inferences are
constrained by syntactic orderings on terms, atoms and clauses. Completeness 
of ordered chaining is usually proved by adapting the model construction of 
equational superposition 0. This strongly couples the deductive process and 
the syntactic ordering. It leads to certain unavoidable inference rules of opaque 
procedural content. 

Our alternative proof-theoretic method is based on a different idea: Retrans- 
late a refutational derivation in the ordered chaining or any superposition cal- 
culus to a refutational derivation in the ordered resolution calculus with theory 
axioms. Then a repeating pattern of non-theory clauses successively resolving 
with one instance of an axiom appears. Theory axioms are therefore independent:
their interaction is not needed in refutational derivations. So why not use
the converse of this observation and derive rule-based calculi from ordered
resolution instead of postulating them? In a first step, make theory axioms
independent. In a second step, search for patterns in refutational ordered resolution
derivations with the independent axioms and turn them into derived inference 
rules internalizing the theory axioms. 

It turns out that the closures of the theory axioms under ordered resolution 
modulo redundancy elimination have exactly the properties required for inde- 
pendent sets. So the first step of our method is the construction of this closure. 
In the second step, the inference rules are derived from the interaction of non- 
theory clauses with the closure, constrained by refined syntactic orderings that 
encode the goal-specific information about the construction. Completeness of the 
calculi thus reduces to correctness of their derivation. Deductive and procedural 
aspects are now clearly separated. In the present example, the transitivity axiom 
is already closed. The derived chaining rules are more restrictive and transpar- 
ent than previous ones. Also the completeness proof is conceptually clear and 
simple. Chaining rules and their ordering constraints can be naturally moti- 
vated by Knuth-Bendix procedures for non-symmetric transitive relations ra- 
in particular, a restriction of the calculus is such a Knuth-Bendix procedure. 
Briefly, the main contributions of this text are the following: 

— We propose a new method for deriving focused calculi based on ordered 
resolution. 

— We use our method for syntactic completeness proofs for superposition cal- 
culi, in which the deductive and the procedural content are clearly separated. 

— We derive a more restrictive and transparent ordered chaining calculus for 
transitive relations. 

— We propose our method for approximating focused calculi and modular con- 
structions with hierarchical theories. 

Here, some proofs can only be sketched. All details can be found in [ 1 41 1 ;-i^ . 
The remainder of the text is organized as follows. Section 2 introduces basic
definitions and the ordered resolution calculus. Section 3 introduces the syntactic
orderings for the chaining calculus, section 4 the calculus itself. Section 5
contains the first step of the derivation of the chaining rules: the computation
of the resolution basis. Section 6 contains the second step: the derivation of the
inference rules. Section 7 discusses several specializations of the calculus and
its relation to the set-of-support strategy and to Knuth-Bendix completion for
non-symmetric rewriting. Section 8 contains a conclusion.

2 Preliminaries 

Let $T_\Sigma(X)$ be a set of terms with signature $\Sigma$ and variables in $X$. Let the
language also contain one single binary relation symbol $<$. The set $A$ of atoms
then consists of expressions $t_1 < t_2$ over terms. A clause is an expression

\[
\{|\phi_1, \dots, \phi_m|\} \to \{|\psi_1, \dots, \psi_n|\}.
\]

Its antecedent $\{|\phi_1, \dots, \phi_m|\}$ and succedent $\{|\psi_1, \dots, \psi_n|\}$ are finite
multisets of atoms. Antecedents are schematically denoted by $\Gamma$, succedents by
$\Delta$. Brackets will usually be omitted. The above clause represents the closed
universal formula

\[
(\forall x_1 \dots x_k)(\neg\phi_1 \vee \dots \vee \neg\phi_m \vee \psi_1 \vee \dots \vee \psi_n).
\]

A Horn clause contains at most one atom in its succedent.

We assume that $<$ denotes a relation satisfying the transitivity axiom (trans)

\[
x < y, \; y < z \to x < z
\]

for all $x, y, z \in X$. We will however not build in monotonicity axioms

\[
x < y \to f(\dots x \dots) < f(\dots y \dots)
\]

for functions $f \in \Sigma$, for reasons that will be explained in section 0. We write
$x_1 < x_2 < x_3 < \dots < x_{n-1} < x_n$ instead of $x_1 < x_2, x_2 < x_3, \dots, x_{n-1} < x_n$.

Definition 1 (Ordered Resolution Calculus). Given a well-founded ordering
$\prec$ on atoms that is total on ground terms, the ordered resolution calculus
OR consists of the following deduction inference rules.

\[
\frac{\Gamma \to \Delta, \phi \qquad \Gamma', \psi \to \Delta'}{\Gamma\sigma, \Gamma'\sigma \to \Delta\sigma, \Delta'\sigma} \quad \text{(Ordered Resolution)}
\]

\[
\frac{\Gamma \to \Delta, \phi, \psi}{\Gamma\sigma \to \Delta\sigma, \phi\sigma} \quad \text{(Ordered Factoring)}
\]

where $\sigma$ is a most general unifier of $\phi$ and $\psi$. In the Ordered Resolution rule,
$\phi\sigma$ is strictly maximal according to $\prec$ in the $\sigma$-instance of the first and maximal
in that of the second premise. In the Ordered Factoring rule, $\phi\sigma$ is maximal in
the $\sigma$-instance of the premise.
In all inference rules, side formulas are the parts of clauses denoted by capital
Greek letters. Atoms occurring explicitly in the premises are called minor
formulas, those in the conclusion principal formulas. 

Let $S$ be a clause set. We define the closures $\mathrm{cl}_{\models}(S)$, denoting the set of
(semantic clausal) consequences of $S$, and $\mathrm{cl}_{OR}(S)$, denoting the set of clauses
derivable in OR from $S$. A clause $C$ is $\prec$-redundant or simply redundant in $S$,
if $C \in \mathrm{cl}_{\models}(C_1, \dots, C_k)$ for some $S \ni C_1, \dots, C_k \prec C$. Elimination of redundant
clauses preserves the set of consequences; it is usually eagerly applied in OR.
We denote this (non-monotonic) operation of OR-closure modulo $\prec$-redundancy
elimination by $\mathrm{cl}_{OR}^{\prec}$. Since $\mathrm{cl}_{\models}(S) = \mathrm{cl}_{\models}(\mathrm{cl}_{OR}^{\prec}(S))$, this operation induces a
basis transformation. It need not terminate, since resolution is only a semi- 
decision procedure. However, fair OR-strategies are refutationally complete: for 
all finite inconsistent clause sets, they derive the empty clause within finitely 
many steps. 

Proposition 1. If $S$ is inconsistent, then $\mathrm{cl}_{OR}^{\prec}(S)$ contains the empty clause.

When $S$ is consistent and $\mathrm{cl}_{OR}^{\prec}$-closed, it satisfies, in addition to being a basis,
also an independence property, since by definition, all conclusions of primary
$S$-inferences, that is OR-inferences with both premises from $S$, are redundant.

Proposition 2 (P). Let $S$ be consistent and closed under $\mathrm{cl}_{OR}^{\prec}$. Let $S \cup T$ be
inconsistent for some clause set $T$. Then there exists a refutational OR-derivation
without primary $S$-inferences.

Thus $\mathrm{cl}_{OR}^{\prec}$ transforms a consistent clause set to an independent or irredundant
basis which is also irreducible by OR-inferences: a resolution basis. By
proposition 2, resolution bases allow resolution strategies similar to set of support,
which is complete in the unordered case for arbitrary consistent clause sets. In
the ordered case, $\mathrm{cl}_{OR}$-closure is necessary for dispensing with primary theory
inferences. The computation of the resolution basis shifts part of proof-search 
complexity from run time to compile time in a well-defined way. It constitutes 
the first step of our derivation of focused ordered resolution calculi. If the proce- 
dure does not terminate, at least finite approximations of resolution bases may 
be used for constructing efficient incomplete calculi. Moreover, a stratified or 
incremental computation of resolution bases for hierarchic theories is possible. 

3 Syntactic Orderings for Ordered Chaining 

The computation of a resolution basis, in particular its termination, crucially
depends on the appropriate syntactic ordering $\prec$ on terms, atoms and clauses.
A reasonable approach is to select the ordering in accordance with well-known
decision procedures. We follow [2] and use the ordering for a variant of
Knuth-Bendix completion for transitive relations. Atom orderings suffice to constrain
the inferences of OR. Clause orderings are needed to handle redundancy. 

Let $\prec$ be a total well-founded ordering on ground terms in $T_\Sigma$. Let $B$ be the
two-element boolean algebra with ordering $<_B$. Let $M = T_\Sigma \times B \times B \times T_\Sigma$. Let
$A$ be a set of atoms occurring in some clause $C = \Gamma \to \Delta$. The ordering
$\prec_1 \subseteq M \times M$ is the lexicographic combination of $\prec$ for the first and last component
of $M$ and $<_B$ for the others. A ground atom measure (for clause $C$) is the
mapping $\mu_C : A \to M$ defined by $\mu_C : \phi \mapsto (t_\nu(\phi), p(\phi), s(\phi), t_\mu(\phi))$ for each
(ground) atom $\phi \in A$ occurring in $C$. Hereby $t_\nu(\phi)$ ($t_\mu(\phi)$) denotes the maximal
(minimal) term with respect to $\prec$ in $\phi$. $p(\phi) = 1$ ($p(\phi) = 0$), if $\phi$ occurs in $\Gamma$
(in $\Delta$). $s(\phi) = 1$ ($s(\phi) = 0$), if $\phi = s < t$ and $s \succ t$ ($s \prec t$). The (ground) atom
ordering $\prec_2 \subseteq A \times A$ is defined by $\phi \prec_2 \psi$ iff $\mu_C(\phi) \prec_1 \mu_C(\psi)$ for $\phi, \psi \in A$.
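
As a rough illustration of the ground atom measure, the following Python sketch computes the four components (maximal term, p, s, minimal term) for an atom s < t and compares atoms lexicographically. The `rank` dictionary (a hypothetical stand-in for the total ground term ordering), the `Atom` type and all function names are assumptions made for this example; they do not appear in the text. The case s = t, which the definition of s leaves open, is mapped to 0 here.

```python
from typing import Dict, NamedTuple, Tuple


class Atom(NamedTuple):
    left: str             # s in the ground atom s < t
    right: str            # t in the ground atom s < t
    in_antecedent: bool   # True if the atom occurs in the antecedent (p = 1)


def atom_measure(atom: Atom, rank: Dict[str, int]) -> Tuple[int, int, int, int]:
    """Return (maximal term, p, s, minimal term), with terms replaced by ranks.

    `rank` is an assumed encoding of the total term ordering: a larger
    integer means a larger term.
    """
    l, r = rank[atom.left], rank[atom.right]
    p = 1 if atom.in_antecedent else 0
    s = 1 if l > r else 0  # assumption: s = 0 also when both sides coincide
    return (max(l, r), p, s, min(l, r))


def atom_less(phi: Atom, psi: Atom, rank: Dict[str, int]) -> bool:
    """Atom ordering: compare measures lexicographically (tuple comparison)."""
    return atom_measure(phi, rank) < atom_measure(psi, rank)


# Ground instance a < b, b < c -> a < c of (trans) with a > b > c.
rank = {"a": 3, "b": 2, "c": 1}
A = Atom("a", "b", True)
B = Atom("b", "c", True)
C = Atom("a", "c", False)
print(atom_less(C, A, rank), atom_less(B, C, rank))
```

With a as the largest term, the sketch orders the atoms as A above C above B, which is the situation of case (i) in the proof of Lemma 2.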

Hence $\prec_2$ is embedded in $\prec_1$ via the atom measure, as shown in the diagram

[Diagram: the atom measure $\mu_C$ maps $A$ to $M$.]

The ordering $\prec_1$ is total and well-founded by construction. Via the embedding,
$\prec_2$ inherits these properties. Intuitively, $t_\nu$ and $s$ yield ordering constraints
turning the Positive Chaining rule defined below into a clausal extension of a critical
pair computation, $p$ guides the derivation of inference rules in section 6 and in
particular makes (trans) a resolution basis in section 5. $t_\mu$ disambiguates atoms
for which the other components of the atom measure are identical.

As free variables are implicitly universally quantified, the orderings $\prec_1$ and
$\prec_2$ are lifted to the non-ground case, defining the ordering $\prec' \subseteq T_\Sigma(X) \times T_\Sigma(X)$
by $s \prec' t$ iff $s\sigma \prec t\sigma$ for all ground substitutions $\sigma$. This criterion need not be
decidable. Defining $\prec'_1$ and $\prec'_2$ is then obvious. These orderings are still
well-founded, but need no longer be total. The following trivial consequence of lifting
determines the ordering constraints of the non-ground ordered chaining calculus.

Lemma 1. $s \not\succeq t$ if $t\sigma \succ s\sigma$ for some ground terms $s\sigma$ and $t\sigma$.

Atom measure and ordering are extended to clauses, measuring clauses as
multisets of their atoms and using the multiset extension of the atom orderings.
The clause ordering on ground clauses inherits totality and well-foundedness
from the atom ordering. Again, the non-ground extension need not be total. In
unambiguous situations we will denote all orderings by $\prec$.
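
The multiset extension used for the clause ordering can be sketched as follows. This is the standard multiset extension of a strict ordering (a multiset M is greater than N if they differ and every element that N has in surplus is dominated by some element that M has in surplus); the function name and the encoding of atom measures as integer tuples are assumptions made for this example.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable


def multiset_greater(m: Iterable[Hashable],
                     n: Iterable[Hashable],
                     greater: Callable[[Hashable, Hashable], bool]) -> bool:
    """Multiset extension of the strict ordering `greater`."""
    m_only = Counter(m) - Counter(n)   # elements M has in surplus
    n_only = Counter(n) - Counter(m)   # elements N has in surplus
    if not m_only and not n_only:
        return False                   # equal multisets are not strictly ordered
    # Every surplus element of N must be dominated by a surplus element of M.
    return all(any(greater(y, x) for y in m_only) for x in n_only)


# Clauses measured as multisets of atom measures (hypothetical values).
clause_1 = [(3, 1, 1, 2), (2, 1, 1, 1)]
clause_2 = [(3, 0, 1, 1), (2, 1, 1, 1)]
print(multiset_greater(clause_1, clause_2, lambda x, y: x > y))
```

Because the atom ordering is total on ground atoms, the resulting clause ordering is total on ground clauses as well, as stated above.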

4 The Ordered Chaining Calculus 

We now define the ordered chaining calculus for transitive relations. It is a more 
restrictive variant of [2]. Instead of their Transitivity Resolution rules we use
Transitivity Factoring rules. We do not intend to build in monotonicity axioms, 
thus we restrict chaining to roots of terms. We only state one of two Negative 
Chaining and Transitivity Factoring rules. A simple inversion of the ordering 
gives the respective other one. 

Definition 2. Let $\prec$ be an atom ordering. The ordered chaining calculus for
transitive relations OC consists of the deduction rules of the ordered resolution
calculus OR, its associated $\prec$-redundancy elimination rule^1 and the following
rules, where $\sigma$ is a most general unifier of $s$ and $s'$, $s\sigma \not\preceq r\sigma$ and, when it
occurs, $s'\sigma \not\preceq t\sigma$.

\[
\frac{\Gamma \to \Delta, r < s \qquad \Gamma' \to \Delta', s' < t}{\Gamma\sigma, \Gamma'\sigma \to \Delta\sigma, \Delta'\sigma, r\sigma < t\sigma} \quad \text{(Positive Chaining)}
\]

where the $\sigma$-instances of the minor formulas are strictly greater than the instances
of the respective side formulas.

\[
\frac{\Gamma, s < t \to \Delta \qquad \Gamma' \to \Delta', s' < r}{\Gamma\sigma, \Gamma'\sigma, r\sigma < t\sigma \to \Delta\sigma, \Delta'\sigma} \quad \text{(Negative Chaining)}
\]

where the $\sigma$-instance of the first (second) minor formula is (strictly) greater
than the $\sigma$-instance of the respective side formula and $s\sigma \not\preceq t\sigma$.

\[
\frac{\Gamma \to \Delta, s < r, s' < r'}{\Gamma\sigma, r\sigma < r'\sigma \to \Delta\sigma, s\sigma < r'\sigma} \quad \text{(Transitivity Factoring)}
\]

where the $\sigma$-instance of the leftmost minor formula is strictly greater than the
$\sigma$-instance of the premise. In addition, a second Negative Chaining rule and a
second Transitivity Factoring rule are defined by inversion of $<$.

Soundness and completeness of OC are the subject of sections 5 and 6. The
meaning of the rules and their ordering constraints is briefly discussed in section 7.
The unordered variant of OC is an instance of theory resolution and therefore 
more focused than mere set of support with (trans). We will derive the OC-rules 
as an ordered variant of theory resolution. Again, they will be more focused than 
mere reasoning with resolution bases. But as a first step in the completeness 
proof, we must derive a resolution basis from (trans) according to lemma El This 
construction is the subject of the following section. 



5 Constructing the Resolution Basis 

We now perform the first step of the derivation of OC. Our theory is (trans).
With the orderings of section 3 we compute its resolution basis, that is, its
OR-closure modulo redundancy elimination. Here, (trans) itself is the resolution basis.

Lemma 2. For the atom ordering $\prec$, (trans) is a resolution basis of the theory
of the transitive relation $<$.

Proof. Order an arbitrary ground instance $a < b < c \to a < c$ of (trans) by the
possible term orderings $\prec$. Let $A = a < b$, $B = b < c$ and $C = a < c$. There are
six different orderings $\prec$ for $a, b, c$. We consider them as three different cases.

(case i) If $a$ is maximal, then $t_\nu(A) = t_\nu(C) \succ t_\nu(B)$ and $p(A) >_B p(C)$.
Hence $A \succ C \succ B$.



^1 Section 2 only defines a semantic notion of redundancy. Every set of inference rules
implementing this notion is admitted. 
(case ii) If $b$ is maximal, then $t_\nu(A) = t_\nu(B) \succ t_\nu(C)$, $p(A) = p(B)$ and
$s(B) >_B s(A)$. Hence $B \succ A \succ C$.

(case iii) If $c$ is maximal, then $t_\nu(B) = t_\nu(C) \succ t_\nu(A)$ and $p(B) >_B p(C)$.
Hence $B \succ C \succ A$.

Thus for all atom orderings $\prec$ and ground instances of (trans), the antecedent
is greater than the succedent. By definition of non-ground atom and clause or- 
derings, the result can immediately be lifted. Thus there are no primary theory 
inferences of (trans). Factoring is inapplicable, since (trans) is a Horn clause. 
Concluding, (trans) is a resolution basis for the theory of transitive relations. □ 
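
The case analysis of this proof is small enough to be checked mechanically. The following Python sketch enumerates the six term orderings on a, b, c, computes the atom measure (maximal term, p, s, minimal term) of A = a < b, B = b < c and C = a < c, and compares the resulting order of the atoms with the three cases above. Ranks as integers and all function names are assumptions made for this example.

```python
from itertools import permutations


def measure(left, right, in_antecedent, rank):
    """Atom measure (maximal term, p, s, minimal term) with terms as ranks."""
    l, r = rank[left], rank[right]
    return (max(l, r), int(in_antecedent), int(l > r), min(l, r))


def descending_atoms(rank):
    """Names of the atoms of the (trans) instance, largest first."""
    atoms = {
        "A": measure("a", "b", True, rank),    # a < b, antecedent
        "B": measure("b", "c", True, rank),    # b < c, antecedent
        "C": measure("a", "c", False, rank),   # a < c, succedent
    }
    return "".join(sorted(atoms, key=atoms.get, reverse=True))


def check_all_orderings():
    """Map each of the six orderings to True if it matches the proof's case."""
    expected = {"a": "ACB", "b": "BAC", "c": "BCA"}  # cases i, ii, iii
    results = {}
    for ranks in permutations((1, 2, 3)):
        rank = dict(zip("abc", ranks))
        maximal = max(rank, key=rank.get)
        results[ranks] = descending_atoms(rank) == expected[maximal]
    return results


print(all(check_all_orderings().values()))
```

In every ordering the largest atom is A or B, so it lies in the antecedent. This is the property that rules out primary theory inferences of (trans).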

Proposition 2 and lemma 2 immediately imply the following facts, which are
essential for the arguments in the following section. 

Corollary 1. (i) For every inconsistent clause set containing (trans) there ex- 
ists a refutational proof without primary theory resolution inferences. 

(ii) For every term ordering, every refutational proof using (trans) is free of 
primary theory resolution inferences. 

(ii) strengthens (i). It holds in this particular case, because primary theory
inferences are not only redundant, but a priori prohibited by the ordering constraints.
Lemma 2 and corollary 1 express a trivial resolution basis computation, thereby
establishing an ordered counterpart of set of support with (trans). The following
section derives inference rules from the resolution basis in the spirit of theory
resolution. In section 7 we will then compare the performance of OC with that
of resolution basis proofs and a further intermediate system.

6 Deriving the Chaining Rules

We now derive the inference rules of OC from OR-derivations with (trans). Our 
main assumptions are refutational completeness of OC (theorem 0) and the fact 
that our ordering constraints rule out primary theory inferences of (trans)
(corollary 1). The derivation proceeds in several steps. First, we show that certain
permutations of inferences, which introduce inferences that violate the ordering
constraints, have no critical impact on the structure of refutational proofs.
Second, we show that corollary 1 can be further strengthened to the OR-rules as
macro inferences consisting of two consecutive resolution inferences in which two
non-theory clauses and one instance of (trans) participate. The combination of
these two properties yields a partition of arbitrary refutational OR-derivations
into patterns corresponding to the OC-rules. Since all instances of (trans) appear
within such macro inferences, they can be internalized. This generates the rules
of the ordered chaining calculus. We first consider the ground case, where the 
constraints on inference rules are simplified. 

The Negative Chaining rule, for instance, becomes

\[
\frac{\Gamma \to \Delta, a < b \qquad \Gamma', a < c \to \Delta'}{\Gamma, \Gamma', b < c \to \Delta, \Delta'} \tag{1}
\]

where $a$ is the strictly maximal term in both premises and the minor formulas
are maximal in the first and strictly maximal in the second clause. In general,
by lemma 1, ordering constraints of the form $t \succ s$ are replaced by those of the
form $t \not\preceq s$. (1) can be derived in OR as

\[
\frac{\dfrac{\Gamma \to \Delta, \boxed{a} < b \qquad \boxed{a} < b < c \to a < c}{\Gamma, b < c \to \Delta, \boxed{a} < c} \qquad \Gamma', \boxed{a} < c \to \Delta'}{\Gamma, \Gamma', b < c \to \Delta, \Delta'} \tag{2}
\]

We put maximal terms in boxes, where they occur in minor formulas. There
are two obstacles arising for the derivation of the Negative Chaining rule in (2).
First, it may happen that $\Gamma', a < c \to \Delta'$ is a second instance of (trans),
$a < c < d \to a < d$, say. We call such a situation a secondary theory inference.
Second, it may happen that $\Delta$ contains an atom bigger than $a < c$. Then the
second inference in (2) violates the ordering constraints and a refutational
OR-derivation using the first inference of (2) must continue with an inference on
a bigger atom. We call such a situation a blocking inference. Secondary theory
inferences are problematic because they prevent us from making (trans) implicit.
Blocking inferences prevent us from deriving instances of chaining rules. We call
an OR-derivation regular, when it contains neither primary nor secondary theory
inferences nor blocking inferences.
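
The derivation of the ground Negative Chaining rule from two resolution steps with one instance of (trans) can be replayed in a few lines of Python. The sketch below ignores all ordering constraints and represents antecedents and succedents as sets rather than multisets; the `Clause` type, the helper names and the sample clauses are assumptions made for this example.

```python
from typing import FrozenSet, NamedTuple, Tuple

Atom = Tuple[str, str]  # (s, t) stands for the ground atom s < t


class Clause(NamedTuple):
    antecedent: FrozenSet[Atom]
    succedent: FrozenSet[Atom]


def resolve(pos: Clause, neg: Clause, atom: Atom) -> Clause:
    """Ground resolution on `atom` (unordered: no maximality checks)."""
    assert atom in pos.succedent and atom in neg.antecedent
    return Clause(pos.antecedent | (neg.antecedent - {atom}),
                  (pos.succedent - {atom}) | neg.succedent)


def trans(a: str, b: str, c: str) -> Clause:
    """Ground instance a < b, b < c -> a < c of the transitivity axiom."""
    return Clause(frozenset({(a, b), (b, c)}), frozenset({(a, c)}))


def negative_chaining(first: Clause, second: Clause,
                      a: str, b: str, c: str) -> Clause:
    """From first = (G -> D, a<b) and second = (G', a<c -> D')
    conclude G, G', b<c -> D, D'."""
    return Clause(first.antecedent | (second.antecedent - {(a, c)}) | {(b, c)},
                  (first.succedent - {(a, b)}) | second.succedent)


# Hypothetical non-theory clauses: p<q -> a<b  and  a<c -> r<s.
first = Clause(frozenset({("p", "q")}), frozenset({("a", "b")}))
second = Clause(frozenset({("a", "c")}), frozenset({("r", "s")}))

step_1 = resolve(first, trans("a", "b", "c"), ("a", "b"))
step_2 = resolve(step_1, second, ("a", "c"))
print(step_2 == negative_chaining(first, second, "a", "b", "c"))
```

The intermediate clause `step_1` is exactly the kind of result that the chaining rule leaves implicit.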

Lemma 3. For every inconsistent clause set there exists a regular refutational 
derivation in OR (possibly violating the ordering constraints). 

Proof. We only give a sketch. A concise proof can be found in [14, 13]. We proceed
in three steps. We show that refutational derivations exist without (i) secondary
theory inferences, (ii) blocking inferences and (iii) both kinds of inferences.

(ad i) By induction on the size of clauses. We inspect secondary theory 
inferences in a refutational 0 R-derivation. Consider again, for instance, infer- 
ence 0 with right-hand premise a < c < d — > a < d and conclusion 
F,b < c < d — >■ A,a < d. We replace this derivation by the inference 

F — >■ A, [o^ < b < b < d — > a < d 

F,b < d — >■ A,a < d ’ 

with a < b < d — >■ a < d. The conclusions of 0 and @ only differ at 
the expressions b < c < d and b < d in the antecedents. We argue that every 
refutational derivation with Q can be replaced by a refutational derivation with 
©• Every refutational derivation with 0 must contain clauses F' — ^ A' ,b < c 
and F” — > A” , c < d eliminating the respective atoms from the conclusion. 
These clauses are also available in the derivation with @. There we need only 
smaller instances of (trans) — as required by the ordering constraints on b, c 
and d — to eliminate b < d from the conclusion of (|2I), too. By the induction 
hypothesis, these new instances do not introduce secondary theory inferences. 
Other cases with secondary theory inferences are similar. 

(ad ii) By induction on the size of clauses. Consider, for instance, a derivation 

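\[
\frac{\dfrac{\Gamma \to \Delta, \boxed{a} < b, a < c' \qquad \boxed{a} < b < c \to a < c}{\Gamma, b < c \to \Delta, a < c, \boxed{a} < c'} \qquad \Gamma'', \boxed{a} < c' \to \Delta''}{\Gamma, \Gamma'', b < c \to \Delta, \Delta'', a < c}
\]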
The second inference is blocking, since it disables a derivation of the Negative
Chaining rule in (2) with the smaller clause $\Gamma', a < c \to \Delta'$. We permute these
two inferences and violate the ordering constraints. It can be shown that under
this permutation we still obtain a refutational proof. Since the side formulas
$\Gamma'$ and $\Delta'$ are small, the change in the derivation remains local in the proof,
up to some copying of proof trees and factoring inferences. This procedure is 
iterated bottom up on all blocking inferences in a refutational derivation, simply 
disregarding previous violations of the ordering constraints. 

(ad iii) By induction on the size of clauses. Primary theory inferences are 
ruled out ab initio by the ordering constraints. We inspect the proof bottom 
up. We first transform secondary theory inferences (if they exist) up to the 
first blocking inference. This does not introduce new blocking inferences, by the 
induction hypothesis. We then transform the first blocking inference. This per- 
mutation introduces at most one secondary theory inference at the top level. 
We then transform all secondary theory inferences up to the second blocking 
inference. Whenever we copy a proof tree in the transformation, we simulta- 
neously transform all copies. Therefore the procedure terminates after finitely 
many steps for each proof and yields a regular derivation. □ 

We are now prepared for our main theorem. 

Theorem 1. The ground ordered chaining calculus OC is refutationally com- 
plete: For every inconsistent ground clause set containing (trans) there exists a 
refutational derivation in OC. 

Proof. Consider a regular derivation to the empty clause. Such a derivation
exists by lemma 3. Hence in all inferences either both premises are non-theory
clauses (with respect to the theory of transitivity) or one premise is a non-theory
clause and the other an instance of (trans). The former inferences are handled by
Ordered Resolution and Ordered Factoring. The latter yield the inference rules
of the ordered chaining calculus, as the following argument shows. Consider
the ordered resolution and ordered factoring steps of non-theory clauses with
(trans). By the proof of lemma 2, for every ordering the antecedent of (trans)
is greater than the succedent. Therefore all non-theory clauses that are resolved
with an instance of (trans) have their minor formula in the succedent. We put
terms at which the chaining takes place in boxes.

We consider the possible ordering constraints on a ground instance a < b < 
c — > a < c of (trans) in interaction with non-theory clauses. 

(case i) Let $a$ be maximal in the axiom and assume that there is a clause
$\Gamma \to \Delta, a < b$ in which the atom $a < b$ is strictly maximal. So in particular $a$
does not occur in $\Gamma$. Then a possible resolution inference is

\[
\frac{\Gamma \to \Delta, \boxed{a} < b \qquad \boxed{a} < b < c \to a < c}{\Gamma, b < c \to \Delta, a < c} \tag{3}
\]

In the conclusion, the maximal element $a$ may occur more than once, but only
in $\Delta$. There are three subcases to consider.
(case i a) $a < c$ is strictly maximal in the conclusion, thus cannot occur in $\Delta$
by definition. Assume there exists a non-theory clause $\Gamma', \boxed{a} < c \to \Delta'$ and a
resolution inference

\[
\frac{\Gamma', \boxed{a} < c \to \Delta' \qquad \Gamma, b < c \to \Delta, \boxed{a} < c}{\Gamma, \Gamma', b < c \to \Delta, \Delta'} \tag{4}
\]

thus deriving a ground instance of the first Negative Chaining rule. In
inference (4), the ordering constraints of ordered resolution require the minor formula
of the left-hand premise to be maximal, but not strictly maximal^2. This gives
the ordering constraints of the ground variant of this Negative Chaining rule.

(case i b) Under the assumption of (case i a), but the second premise is
another instance $a < c < d \to a < d$ of (trans). An inference with this instance
would be a secondary theory inference contradicting the assumption of regularity.

(case i c) Under the assumption of (case i a), but the second premise is
another instance $d < a, \boxed{a} < c \to d < c$ of (trans). An inference with this
instance would again be a secondary theory inference contradicting our assumption
of regularity.

(case i d) a occurs in A = A,a < d and c = c' . Then the derivation must 
continue with factoring as 

r,b < c — >■ A,a < c,a < c 
r,b < c — >A,a<c ’ 



thus deriving a ground instance of one of the Transitivity Factoring rules. 

(case i e) a occurs in Z\ = A,a < c' and c' >- c. Then the non-theory clause 
in inference 13 has the form F — > A, a < c',a < b with a < d greater than 
a < c. Then the next inference would be a blocking inference contradicting our 
assumption of regularity. 

(case ii) Let $b$ be maximal in the axiom. Then, by (case ii) of the proof of
lemma 2, the atom $b < c$ is strictly maximal in the axiom and there is a clause
$\Gamma \to \Delta, \boxed{b} < c$ with $b \succ c$. Then a possible inference with an instance of the
transitivity axiom is

\[
\frac{\Gamma \to \Delta, \boxed{b} < c \qquad a < b, \boxed{b} < c \to a < c}{\Gamma, a < b \to \Delta, a < c} \tag{6}
\]

Since moreover $b \succ a$, an inference of a non-theory clause with the resolvent of (6)
must have the form

\[
\frac{\Gamma' \to \Delta', a < \boxed{b} \qquad \Gamma, a < \boxed{b} \to \Delta, a < c}{\Gamma, \Gamma' \to \Delta, \Delta', a < c} \tag{7}
\]

such that a ground instance of the Positive Chaining rule is derived. Also, the
ordering constraints of ordered resolution require both minor formulas in the
non-theory clauses to be strictly maximal, as for the Positive Chaining rule.

^2 This is because ordered factoring is only defined for succedents.
Here the ordering constraints and lemma 0 rule out that the second inference
is a secondary theory inference or a blocking inference. 

(case iii) Let $c$ be maximal in the axiom. This case is analogous to (case
i) and yields the second Negative Chaining rule. Assume that there is a clause
$\Gamma \to \Delta, b < c$ in which the atom $b < c$ is strictly maximal. So in particular $c$
does not occur in $\Gamma$ and $\Delta$ does not contain an atom of the form $c < d$. Then a
possible resolution inference is

\[
\frac{\Gamma \to \Delta, b < \boxed{c} \qquad a < b < \boxed{c} \to a < c}{\Gamma, a < b \to \Delta, a < c} \tag{8}
\]

In the conclusion, the maximal term $c$ may occur more than once, but only in
$\Delta$. There are three subcases to consider.

(case iii a) $a < c$ is strictly maximal in the conclusion and does not occur in
$\Delta$. Assume there exists a non-theory clause $\Gamma', a < \boxed{c} \to \Delta'$ and a resolution
inference

\[
\frac{\Gamma', a < \boxed{c} \to \Delta' \qquad \Gamma, a < b \to \Delta, a < \boxed{c}}{\Gamma, \Gamma', a < b \to \Delta, \Delta'} \tag{9}
\]

thus deriving a ground instance of the second Negative Chaining rule. In
inference (9) the ordering constraints of ordered resolution again require the
minor formula of the left-hand premise to be maximal, but not strictly maximal.
This gives the ordering constraints of the ground variant of the second Negative
Chaining rule.

(case iii b) Under the assumption of (case i a), but the second premise is 
another instance d < a < — > d < c of (trans). An inference with this 

instance would be a secondary theory inference contradicting our assumption of 
regularity. 

(case iii c) Under the assumption of (case i a), but the second premise is 
another instance b < c < d — > b < d of (trans). This inference would however 
violate the ordering constraints such that the argument of regularity need not 
be applied, c < d and not 6 < c is strictly maximal in that instance, since 
s(c < d) = 1 >B 0 = s(b < c). 

(case iii d) c occurs in A = A,a' < c and a = a' . Then the proof must 
continue with factoring as 



r,a < b — > A, a < c, a < c 
r,a <b — > A,a < c ’ 



( 10 ) 



thus deriving a ground instance of the second Transitivity Factoring rule. 

(case iii e) a occurs in A = A, a < c' and c' >- c. Then the the non-theory 
clause in inference 0has the form F — > A, a < c' , a < b with a < c' greater 
than a < c. Then the next inference would be a blocking inference contradicting 
our assumption of regularity, in analogy to (case i e) . □ 

Theorem 1 can be extended to the non-ground case either by lifting or directly
by considering non-ground non-theory clauses. 
Theorem 2. The ordered chaining calculus OC is refutationally complete.
Proof. Revisit the cases of theorem 1. The unification constraints and the
ordering constraints on atoms follow from those of the ordered resolution calculus.
The ordering constraints on terms follow from lifting of ground term orderings,
as expressed by lemma 1.

Soundness of OC is trivial from the completeness proof, since all rules have been 
derived from OR with the transitivity axiom. 

7 Discussion 

Inspection of the proofs of Theorems 1 and 2 immediately yields an ordered
chaining calculus for Horn clauses, simply omitting Ordered Factoring and Transitivity
Factoring. In particular, blocking inferences do not occur in Horn derivations.

The completeness results further specialize to decidability of certain subcases.
To this end, the clause ordering ≻ must be extended to make all inference rules
monotonic: their conclusions must be smaller than their maximal premises. Then
closure computations lead to smaller and smaller new clauses but, as with
Knuth-Bendix completion, this does not yet mean termination. ≻ is in general
transfinite, such that for each object greater than some limit ordinal infinitely
many smaller objects exist, and there is no limit on the number of instances of
each clause that can be taken. At least the closure computation of OC from
a finite set of ground non-theory clauses terminates, since no new terms are
generated by the computation. Therefore the term ordering ≻ can be finitely
enumerated and the maximal clause in the input set is a finite upper bound of
the clauses in the closure.

Avoidance of eager generation of fresh variables by primary theory inferences
is a main advantage of OC. Our discussion of secondary theory inferences,
however, shows that even a set-of-support-like strategy with the resolution basis
cannot avoid an accumulation of undesired variables. This problem can be
circumvented by using the resolution basis together with additional bookkeeping to
avoid secondary theory inferences. But still, every resolution inference between
a non-theory clause Γ → Δ, a < b and (trans) leads to intermediate results,
like Γ, b < x → Δ, a < x, that must be stored, whereas they are left implicit
in OC. From a theoretical point of view, this increase of proof length may seem
insignificant, and it has no impact on the complexity of proof search, but
practically the ordered chaining calculi seem superior to strategies directly using the
resolution basis.

Our derivation method of ordered chaining calculi is a synthetic approach. 
So far, we have derived the inference rules, but not motivated the ordering 
constraints, which include the procedural aspects of the calculi and integrate 
theory-specific knowledge. In fact, these constraints find a natural explanation 
via concepts of rewriting and the Knuth-Bendix procedure for non-symmetric 
transitive relations, as specified in [14].^ In a real synthesis situation, the
semantic information is needed a priori to encode the desired properties of the
inference rules via the ordering. Here, a first step in the development process
would be to model the Positive Chaining rule as an extension of the critical pair
computation rule of the Knuth-Bendix procedure for transitive relations. The
ordering can then be further modified for other “meaningful” rules.

^ Similar ideas appear in [2]. An approach to rewriting with quasi-orderings was first
proposed by |H].

Consider a presentation of a transitive relation < on a set of ground terms
in presence of a syntactic ordering ≻ which is well-founded and total on ground
terms, atoms and <-chains. The presentation can be partitioned into a decreasing
part $R = ({<} \cap {\succ})$, an increasing part $S = ({<} \cap {\prec})$ and the reflexive part
$\Delta_<$ of <, which is neither increasing nor decreasing. In analogy to the
Church-Rosser theorem of equational rewriting one can ask for conditions to replace
$(R \cup S)$-chains by elements of $R^+S^* \cup R^*S^+$, that is by chains that are
monotonically decreasing from their initial and final elements. The following statement
generalizes Newman’s lemma to non-symmetric rewriting.

Lemma 4. Let $(R \cup S^{-1})$ be well-founded. Then $SR \subseteq R^+S^* \cup R^*S^+$ implies
$(R \cup S)^+ \subseteq R^+S^* \cup R^*S^+$.

If this replacement property holds, then s < t can be tested by first checking
the reflexive part of < or else spanning the R-digraph from s and the inverse
S-digraph from t and searching for a common vertex. By well-foundedness of ≻,
both digraphs are acyclic. This test is a decision procedure when both digraphs
have finite out-degree.
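
The test just described can be sketched in Python for a finite ground presentation. This is only an illustration under stated assumptions: the presentation is given as finite sets of pairs `R` (decreasing part), `S` (increasing part) and `reflexive` (reflexive part), the replacement property is assumed to hold already, and the names `reachable`, `adjacency` and `is_less` are hypothetical.

```python
from collections import deque


def adjacency(pairs):
    """Turn a set of edges (a, b) into a successor map a -> {b, ...}."""
    successors = {}
    for a, b in pairs:
        successors.setdefault(a, set()).add(b)
    return successors


def reachable(start, successors):
    """All vertices reachable from `start` (including `start`), by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for nxt in successors.get(vertex, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_less(s, t, R, S, reflexive=frozenset()):
    """Test s < t, assuming every <-chain can be replaced by an R*S*-valley."""
    if (s, t) in reflexive:
        return True
    below_s = reachable(s, adjacency(R))                       # span the R-digraph from s
    below_t = reachable(t, adjacency((b, a) for a, b in S))    # span the inverse S-digraph from t
    common = below_s & below_t
    if s == t:
        # A chain from s to itself must contain at least one step.
        common = common - {s}
    return bool(common)
```

For the presentation c < a, a < b with a ≺ b ≺ c, we have `R = {("c", "a")}` and `S = {("a", "b")}`, and the search from c and b meets at the common vertex a.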

To enforce this replacement property, the Knuth-Bendix procedure for
non-symmetric transitive relations iteratively adds critical pairs in SR to an initial
presentation. In the ground case, this is precisely the role of the restriction

$$\frac{\rightarrow a < b \qquad \rightarrow b < c}{\rightarrow a < c}$$

of the ground Positive Chaining rule to positive atoms, where b must be strictly
maximal according to the ordering constraint^. The measure and ordering
needed for this computation are essentially the atom measure and ordering with- 
out the second component. Therefore the theory of rewriting and completion for 
non-symmetric transitive relations may serve as a starting point for developing 
OC, extending the ordering constraints to the clausal level. In the non-ground 
case, OC extends a variant of ordered Knuth-Bendix completion (c.f. |0| for its 
equational counterpart). 
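
As a rough illustration of the ground case, the following Python sketch saturates a finite presentation under the restricted Positive Chaining rule. It is not the calculus OC itself: it assumes that the syntactic ordering is given by an injective `key` function, it adds every critical pair without testing whether the pair is already joinable, and the name `complete` is hypothetical.

```python
def complete(presentation, key):
    """Saturate a finite set of ground pairs (a, b), read as a < b, under peaks."""
    pairs = set(presentation)
    while True:
        increasing = {(a, b) for a, b in pairs if key(a) < key(b)}   # the part S
        decreasing = {(a, b) for a, b in pairs if key(a) > key(b)}   # the part R
        # A critical pair in SR: a < b and b < c where b is strictly maximal.
        new_pairs = {
            (a, c)
            for a, b in increasing
            for b2, c in decreasing
            if b == b2
        } - pairs
        if not new_pairs:
            return pairs
        pairs |= new_pairs
```

Since no new terms are generated, only finitely many pairs can be added and the loop terminates on finite input.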

Non-symmetric rewriting also explains why we did not build in monotonicity, 
that is why chainings have been restricted to roots of terms. In equational rewrit- 
ing, critical pairs arise only when one rewrite rule applies to another rule at a 
position that is not labeled by a variable. In case of transitive relations however, 
also critical pairs corresponding to certain variable positions must be considered. 
Then it is impossible to bound the position in an instance of a variable where 
the rule is applied and these critical pairs cannot be finitely represented within 
first-order logic. Since the ordered Positive Chaining rule generalizes critical pair
computations, monotonicity causes complications in connection with this rule.
In particular, the complications depend on decidability of second-order context
unification (cf. |S|), which to our knowledge is open. But even without
monotonicity, chaining inferences involving an atom s < x or x < t cannot completely
be avoided when Γ and Δ are non-empty (c.f. 0).

^ Note that no ground critical pair rule appears for equational completion.

For a more detailed discussion of these issues and a comparison with the 
model construction and calculi of j2j consider uma. 

8 Conclusion and Related Work 

We proposed a new method for deriving focused calculi based on ordered reso- 
lution. We used our method for syntactic completeness proofs for superposition 
calculi, in which the deductive and the procedural content are clearly separated. 
Here, we derived a more restrictive and transparent ordered chaining calculus for 
transitive relations; applications to various lattice calculi can be found in M 
Our method can also be used for approximating focused calculi and modular 
constructions with hierarchical theories. 

Focusing means integrating theory-specific deductive and procedural knowl- 
edge into inference rules. The rules are derived from an axiomatic representation 
via ordered resolution. This proof-theoretic approach is in contrast to previous 
semantic approaches, where inferences had to be guessed and a posteriori veri- 
fied through model constructions. The derivation takes two steps. First, the the- 
ory axioms are transformed to a resolution basis, which has the property that 
no resolution inferences between its members must be considered in refutation 
derivations. If this transformation does not terminate, one can at least extract 
a resolution basis per hand or use a finite approximation for an incomplete cal- 
culus. Second, the interaction of this resolution basis with non-theory clauses in 
refutational ordered resolution derivations is considered. Again one can either 
extract inference rules of a complete calculus that exhaustively internalizes the 
theory axioms or search for sound but incomplete approximations. 

In case of transitive relations our method leads to a more restrictive and 
transparent calculus and allows a more fine-grained consideration of the relation 
of ordered chaining to ordered resolution, of the ordering constraints and of the 
relation to rewriting techniques. It supports a concise evaluation and comparison 
of the proof search complexity of chaining calculi with related methods. 

For unordered resolution, the set of support strategy has been developed 
in US], a scenario for theory resolution in H2|. Both methods are based on se- 
mantic considerations. Ordered chaining calculi have been proposed in m and 
later without the opaque Transitivity Resolution rule in ^ , based on additional 
coding. Although such coding is beyond our simple and natural approach, for 
which the Transitivity Factoring rule is procedurally transparent, it seems in- 
teresting to reconsider our calculus in this direction in the future. Besides our 
lattice calculi, a consideration of equational theories also appears very interest- 
ing for empirically demonstrating the power and applicability of our method for 
deriving focused theory-specific calculi based on ordered resolution. 
Abstract. We present the titular proof development which has been
implemented in Isabelle/HOL. As a first, the proof is conducted exclu- 
sively by the primitive induction principles of the standard syntax and 
the considered reduction relations: the naive way, so to speak. Curiously, 
the Barendregt Variable Convention takes on a central technical role in 
the proof. We also show (i) that our presentation coincides with Curry’s 
and Hindley’s when terms are considered equal up to α and (ii) that the
confluence properties of all considered calculi are equivalent. 



1 Introduction 

The λ-calculus is a higher-order language: terms can be abstracted over terms.
It is intended to formalise the concept of a function. The terms of the λ-calculus
are typically generated inductively thus: $\Lambda^{var} ::= x \mid \Lambda^{var}\,\Lambda^{var} \mid \lambda x.\Lambda^{var}$

A λ-term, $e \in \Lambda^{var}$, is hence finite and is either a variable, an application of
one term to another, or the functional abstraction (aka binding) of a variable 
over a term, respectively. On top of the terms, we define reduction relations, as 
we shall see shortly. Intuitively, we will also want to consider terms that only 
differ in the particular names used to express abstraction to be equal. However, 
this is a slightly tricky construction as far as the algebra of the syntax goes and 
we will only undertake it after mature consideration. 

It is common, informal practice to take the variables to belong to a single 
infinite set of names, $\mathcal{VN}$, with a decidable equality relation, =, and that is
indeed what we will do. Recent research I5IT7I has shown that there can be 
formalist advantages to employing a certain amount of ingenuity on the issue 
of variable names. Still, we make a point of following the naive approach. In 
fact, the main contribution of this paper is to show that it is not only possible 
but also feasible and even instructive to use this, the naive set-up, for formal 
purposes. This is relevant both from a foundational and a practical perspective. 

* Supported under EU TMR grant # ERBFMRXCT-980170: LINEAR. Work done 
in part while visiting LFCS, University of Edinburgh from Heriot-Watt University.
The latter more so as we, as a first, give a rational reconstruction of the widely
used and very helpful Barendregt Variable Convention (BVC) p. 

We stress that $\Lambda^{var}$ is first-order abstract syntax (FOAS) and therefore comes
equipped with a primitive (first-order) principle of structural induction [2]:

$$\frac{\forall x.\,P(x) \qquad \forall e_1,e_2.\,P(e_1) \wedge P(e_2) \rightarrow P(e_1e_2) \qquad \forall x,e.\,P(e) \rightarrow P(\lambda x.e)}{\forall e.\,P(e)}$$

Similarly, the syntax also comes equipped with a primitive recursion principle 
so we can define auxiliary notions (e.g., free variables) by case-splitting. 
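
To make the first-order reading concrete, the following Python sketch represents the three syntactic cases as plain data and defines free variables by case-splitting on them. It is an illustration only, not the Isabelle/HOL development; the class names `Var`, `App`, `Lam` and the function `free_vars` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def free_vars(e):
    """Free variables, defined by primitive recursion over the three term cases."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, App):
        return free_vars(e.fun) | free_vars(e.arg)
    if isinstance(e, Lam):
        return free_vars(e.body) - {e.var}
    raise TypeError(f"not a term: {e!r}")
```

Each clause only refers to the immediate sub-terms of its argument, which is exactly the shape that structural induction can reason about.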



The Issues. In the set-up of FOAS defined over one-sorted variable names 
(FOASvAf)i name-overlaps seem inevitable when computing. Traditionally, one 
therefore renames offending binders when appropriate. This has a two-fold neg- 
ative impact: (i) the notion ‘sub-term of’ on which structural induction depends 
is typically brokenQ and (ii) as a term can reduce in different directions, the 
resulting name for a given abstraction cannot be pre-determined. Consider, e.g., 
the following example taken from HH — for precise definitions see Section I I .21 



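$$(\lambda x.(\lambda y.\lambda x.xy)x)y \rightarrow_\beta (\lambda y.\lambda x.xy)y \rightarrow_\beta \lambda x.xy$$

$$(\lambda x.(\lambda y.\lambda x.xy)x)y \rightarrow_\beta (\lambda x.\lambda z.zx)y \rightarrow_\beta \lambda z.zy$$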



Equational reasoning about FOAS$_{\mathcal{VN}}$ can thus seemingly only be conducted
up to post-fixed “name-unification”. Aside from any technical problems this might
pose, the formal properties we establish require some interpretation.

The basic problems with FOAS$_{\mathcal{VN}}$ have directly resulted in the inception of
syntax formalisms (several of them recent) which overcome the issues by
native means |4|.'i|til7ISI1 V’j . In general, they mark a conceptual and formal
departure from the naive qualities of FOAS$_{\mathcal{VN}}$. This is in part unfortunate because
FOAS$_{\mathcal{VN}}$ is the de facto standard in programming language theory where, as a
result of the problems, it is customary to reason while “assuming the BVC” pi



“2.1.12. Terms that are α-[equivalent] are identified.”

“2.1.13. If $M_1, \ldots, M_n$ occur in a certain mathematical context, [their]
bound variables are chosen to be different from the free variables.”
“2.1.14. Using 2.1.12/13 one can work with λ-terms the naive way.”



Our Contribution. We 

— show that it is possible and feasible to conduct formal equational proofs 
about higher-order languages by simple, first-order means 

— show that this can be done over FOAS$_{\mathcal{VN}}$, as done by hand

— formally justify informal practices; in particular, the BVC HEH

— contribute to a much needed proof theoretical analysis of binding j8K^

— introduce a quasi-complete range of positive and negative results about the
preservation and reflection of confluence under a large class of mappings

^ Thanks to Regnier for observing that this need not happen with parallel substitution.
^ We make reference to Barendregt because it is common practice to do so. Many
other people have imposed hygiene conditions on variables.



1.1 Terminology and Conventions 

We say that a term reduces to another if they are related by a reduction relation 
and we denote it by an infix arrow. The sub-term a reduction step acts upon is 
called the redex and it is said to be contracted. A reduction relation for which 
a redex remains so when occurring in any sub-term position is said to be con- 
textually closed. We will distinguish raw and real calculi: inductive structures 
vs. the former factored by an equivalence. We use dashed respectively full-lined 
relational symbols for them. The first 5 of the following notions can be given by 
proper inductive constructions. 

— The converse of a relation, $\rightarrow$, is written $(\rightarrow)^{-1}$.

— Composition is: $a \rightarrow_1;\rightarrow_2 c \;\Leftrightarrow_{def}\; \exists b.\ a \rightarrow_1 b \wedge b \rightarrow_2 c$.

— Given two reduction relations $\rightarrow_1$ and $\rightarrow_2$, we have: $\rightarrow_{1 \cup 2} \;=_{def}\; \rightarrow_1 \cup \rightarrow_2$.

— Transitive, reflexive closures: $(\rightarrow)^* = {\twoheadrightarrow} =_{def} {=} \cup (\rightarrow;\twoheadrightarrow)$.

— Transitive, reflexive, and symmetric closures: ${==_A} =_{def} (\rightarrow_A \cup (\rightarrow_A)^{-1})^*$.

— A relation which is functional will be written with a based arrow: $\mapsto$.

— A term reducing to two terms is called a divergence.

— Two diverging reduction steps, as defined above, are said to be co-initial.

— Two reduction steps that share their end-term are said to be co-final.

— A divergence is resolvable if there exist connecting co-final reduction steps.

— A relation has the diamond property, $\Diamond$, if any divergence can be resolved.

— A relation, $\rightarrow$, is confluent, Confl, if $\Diamond(\twoheadrightarrow)$.

— Weak confluence is transitive, reflexive resolution of any divergence.

— An abstract rewrite system, ARS, is a relation on a set: $\rightarrow \subseteq A \times A$.

— Residuals are the descendants of terms under reduction. 
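
For finite ARS, several of the notions above can be checked mechanically. The Python sketch below is an illustration under the assumption that a relation is a finite set of pairs over a finite carrier; the function names are hypothetical.

```python
def compose(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}


def refl_trans_closure(rel, carrier):
    """Reflexive, transitive closure of `rel` over a finite `carrier`."""
    closure = {(a, a) for a in carrier} | set(rel)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger


def has_diamond(rel):
    """Every divergence b <- a -> c is resolved by single steps b -> d <- c."""
    targets = {d for _, d in rel}
    return all(
        any((b, d) in rel and (c, d) in rel for d in targets)
        for a, b in rel
        for a2, c in rel
        if a == a2
    )


def is_confluent(rel, carrier):
    """Confluence: the diamond property of the reflexive, transitive closure."""
    return has_diamond(refl_trans_closure(rel, carrier))
```

Note that `has_diamond` fails for a relation with a reachable normal form unless the relation is reflexive, because the trivial divergence of a step with itself must also be resolved.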

1.2 Classic Presentations of the A-Calculus 

We will here review Curry’s seminal formalist presentation of the λ-calculus [3].
We will also review Hindley [11] as, to the best of our knowledge, he is the first
to give serious consideration to the problems with names in equational proofs.
The process of term substitution epitomises the issues. The two $\Lambda^{var}$-terms λx.y
and λz.y, e.g., have the same intuitive meaning. If we intend to substitute, say,
the term x in for y, simple syntactic replacement would result in the intuitively
different terms λx.x and λz.x. Some subtlety is therefore required.



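Curry’s Presentation. Curry [3] essentially defines the terms of the λ-calculus
to be $\Lambda^{var}$ with the proviso that variable names are ordered linearly. He defines
substitution as follows (for free variables, FV(−), see Figure 1):

$$y\langle x := e\rangle = \begin{cases} e & \text{if } x = y \\ y & \text{otherwise} \end{cases}$$

$$(e_1e_2)\langle x := e\rangle = e_1\langle x := e\rangle\, e_2\langle x := e\rangle$$

$$(\lambda y.e')\langle x := e\rangle = \begin{cases} \lambda y.e' & \text{if } x = y \\ \lambda y.e'\langle x := e\rangle & \text{if } x \neq y \wedge (y \notin FV(e) \vee x \notin FV(e')) \\ \lambda z.e'\langle y := z\rangle\langle x := e\rangle & \text{o/w; first } z \notin \{x\} \cup FV(e) \cup FV(e') \end{cases}$$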

Curry is seminal in giving a precise definition of substitution which takes into
account that scoping is static. He then defines the following reduction relations
for $\Lambda^{var}$ which are closed contextually:

— Ay.e(x := y) — Ax.e, if y ^ FV(e) 

— (Ax.e)e' — e(x := e') 

Unfortunately, following on from here, Curry makes no further mention of α
in the proofs of the equational properties of the λ-calculus. Instead, all proofs
are seemingly conducted implicitly on α-equivalence classes although these are
not formally introduced.
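
Curry-style substitution with binder renaming can be sketched in Python as follows. This is a sketch under assumptions, not Curry's own formulation: the linear ordering of variable names is taken to be a, b, ..., z, a1, b1, ..., and the names `Var`, `App`, `Lam`, `fv`, `names` and `curry_subst` are hypothetical.

```python
import itertools
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def fv(e):
    """Free variables of a term."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, App):
        return fv(e.fun) | fv(e.arg)
    return fv(e.body) - {e.var}


def names():
    """An assumed linear enumeration of variable names: a..z, a1..z1, ..."""
    for n in itertools.count():
        for c in string.ascii_lowercase:
            yield c if n == 0 else f"{c}{n}"


def curry_subst(t, x, e):
    """Substitute e for x in t, renaming a binder when it would capture."""
    if isinstance(t, Var):
        return e if t.name == x else t
    if isinstance(t, App):
        return App(curry_subst(t.fun, x, e), curry_subst(t.arg, x, e))
    y, body = t.var, t.body
    if x == y:
        return t
    if y not in fv(e) or x not in fv(body):
        return Lam(y, curry_subst(body, x, e))
    # Renaming clause: pick the first name not in {x} ∪ FV(e) ∪ FV(body).
    avoid = {x} | fv(e) | fv(body)
    z = next(n for n in names() if n not in avoid)
    return Lam(z, curry_subst(curry_subst(body, y, Var(z)), x, e))
```

Substituting x for y in λx.y triggers the renaming clause and yields λa.x under the assumed enumeration, whereas the same substitution in λz.y yields λz.x without renaming.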

Hindley’s Presentation. This situation, amongst others, was rectified by
Hindley [11]. In order to address α-equivalence classes explicitly, Hindley
introduced a restricted α-relation which we call $\alpha^H$. The relation is given as the
contextual closure of:

— $\lambda x.e \dashrightarrow_{\alpha^H} \lambda y.e\langle x := y\rangle$, if $x \neq y$, $y \notin FV(e) \cup BV(e)$, and $x \notin BV(e)$

The a^-relation has the nice property that the renaming clause of — (— := — ) 
is not invoked, cf. Lemma|^ Furthermore, a number of Hindley’s results conspire 
to establish the following property: 

Lemma 1 (From Lemma 4.7, Lemma 4.8, Corollary 4.8 [11])

$$==_{\alpha^C} \;=\; \twoheadrightarrow_{\alpha^H} \;=\; ==_{\alpha^H}$$

Notation 2 To have an axiomatisation-independent name for α-equivalence
on $\Lambda^{var}$, we will also refer to the relation of the above lemma as ℵ (read: aleph).

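With this result in place, Hindley undertakes a formal study of α-equivalence
classes which leads to the definition of a further β-relation, this time on
$\alpha^H$-equivalence classes:

$$\lfloor e\rfloor_\aleph \;=_{def}\; \{e' \mid e ==_{\alpha^H} e'\}$$

$$\lfloor e_1\rfloor_\aleph \rightarrow_{\beta^H} \lfloor e_2\rfloor_\aleph \;\Leftrightarrow_{def}\; \exists e_1' \in \lfloor e_1\rfloor_\aleph,\ e_2' \in \lfloor e_2\rfloor_\aleph.\ e_1' \dashrightarrow_{\beta^C} e_2'$$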

It is this relation which Hindley proves confluent albeit with no formal consid- 
erations concerning the invoked proof principles. This puts Hindley’s treatment 
of the A-calculus firmly apart from the present article. Interestingly, Hindley also 
points out that the obtained (real) confluence result implies confluence of the 
combined a^- and /3®-relation. We are able to formally substantiate this remark 
of Hindley, cf. Theorem El 
1.3 Related Work



There has recently been a substantial amount of work on proof principles for 
syntax that seemingly are more advanced than the first-order principles we use 
[I4I5I6I7I8I1 on 2| . These lines of work, in particular the continuations of 0EE2I, 
are all very interesting but orthogonal to the work we present here. We suggest 
that a study of the proof-theoretical strength of these different proof principles 
might be informative and we leave it as potential future work. 

There are a number of formalisations of /3-confluence 1 1 31 1 71 1 iSI22i23j . Apart 
from EH which uses two-sorted variable names to distinguish bound and free 
variables, they all use either higher-order abstract syntax or de Bruijn indexing. 

We know of no formal proof developments, be it for equational properties or
otherwise, that are based on FOAS$_{\mathcal{VN}}$. That said, Schroer PH does undertake
a hand-proof of confluence of the λ-calculus which explicitly pays attention to
variable names but with its 700+ pages, it is perhaps not as approachable as
could be desired. Besides, no particular attention is paid to the employed proof 
principles and no formalisation is undertaken. 



Acknowledgements. The first author wishes to thank Olivier Danvy, Jean- 
Yves Girard, Stefan Kahrs, Don Sannella, Randy Pollack, and in particular 
Joe Wells for fruitful discussions. The second author wishes to thank James 
Margetson, Larry Paulson, and Markus Wenzel for help and advice on using 
Isabelle/HOL. Finally, both authors wish to thank LFCS and the anonymous 
referees. 



1.4 A Word on Our Proofs 

The Isabelle/HOL proof development underpinning the present article was un- 
dertaken mainly by the second author in the space of roughly 9 weeks. It is 
available from the first author’s homepage. At the time of writing, the confluence
properties for our $\lambda^{var}$-calculus (Section 3) and the λ-calculus proper have
been established. The Isabelle proof development closely follows the presentation
we give here. There are one or two differences which are exclusively related to 
the use of alternative but equivalent induction principles in certain situations. 

We started from scratch and learned theorem proving and Isabelle as we went 
along. Our proofs are mainly brute-force in that Isabelle apparently had prob- 
lems overcoming the factorial blow-up in search space arising from the heavily 
conditioned proof goals for our conditional rewrite rules. Presently, the size of 
our proof scripts is in the order of 4000 lines of code. 

The second author’s Honours project will contain more detailed information 
about the proof development itself and will focus in part on the automation 
issue. The first author’s thesis will focus more generally on first-order equational 
reasoning about higher-order languages. 



2 Abstract Proof Techniques for Confluence 

We now present the (new and well-known) abstract rewriting methods we use. 
2.1 Preservation and Reflection of Confluence



Surprisingly, the results in this section seem to be new. Although they are very
basic and related to the areas of rewriting modulo and refinement theory, we
have not found any comprehensive overlaps.^ In any event, the presentation is
novel and instructive for the present purposes. Before proceeding, we refer the 
reader to Appendix 0for an explanation of our diagram notation. 



Definition 3 (Ground ARS Morphism) Assume two ARS: $\rightarrow_A \subseteq A \times A$
and $\rightarrow_B \subseteq B \times B$. A mapping, $M : A \rightarrow B$, will be said to be a ground ARS
morphism^ from $\rightarrow_A$ to $\rightarrow_B$ if it is total and onto on points and a homomorphism
from $\rightarrow_A$ to $\rightarrow_B$:

[Diagrams: (total), (onto), (homo)]



An example of a ground ARS morphism is the function that sends an object 
to its equivalence class relative to any equivalence relation (such as α- or
AC-equivalence): what one would call a “structural collapse”. Notice that a ground
ARS morphism prescribes surjectivity on objects but not on relations (and, as 
such, should not be called a “structural collapse” in itself). Instead, the following 
theorem analyses the various “degrees of relational surjectivity” relative to the 
confluence property. 
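
For finite ARS the three conditions of a ground ARS morphism can be checked directly. The Python sketch below is illustrative only: it assumes the mapping is a dictionary, reads the homomorphism condition as "a step in A is sent to a step in B", and uses the hypothetical name `is_ground_ars_morphism`.

```python
def is_ground_ars_morphism(m, rel_a, carrier_a, rel_b, carrier_b):
    """Check totality, surjectivity on points, and the homomorphism condition."""
    total = all(a in m and m[a] in carrier_b for a in carrier_a)
    if not total:
        return False
    onto = {m[a] for a in carrier_a} >= set(carrier_b)
    homo = all((m[a], a2_image) in rel_b
               for a, a2 in rel_a
               for a2_image in [m[a2]])
    return onto and homo
```

The example of sending an object to its equivalence class fits this shape: the steps of B are taken to be the images of the steps of A.

```python
carrier_a = {1, 2, 3, 4}
rel_a = {(1, 2), (3, 4)}
collapse = {n: ("odd" if n % 2 else "even") for n in carrier_a}
rel_b = {(collapse[a], collapse[b]) for a, b in rel_a}
print(is_ground_ars_morphism(collapse, rel_a, carrier_a, rel_b, {"odd", "even"}))
```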

Theorem 4 Given a ground ARS morphism, M, from $\rightarrow_A$ to $\rightarrow_B$, we have^

[Cases 1 to 4: each case consists of a premise diagram over M and a statement relating $\Diamond(\rightarrow_A)$ and $\Diamond(\rightarrow_B)$; the diagrams are not recoverable from the source.]



Proof The positive results are straightforward to establish. The reflexive(!) 
versions of the following ARS provide counter-examples for all the negative re- 
sults, left-to-right and right-to-left, respectively. Reflexivity is required to estab- 
lish the o property in the first place. 



[Counter-example ARS diagrams not recoverable from the source.]



The asymmetry between cases |2I and 0 is due to the functionality of M. 

® A special case of Theorem 00 is reported in M and we contradict a result in (1 flj . 
The name is inspired from |2I1| . 

® In the theorem, the notation 7b (yb) means existence of counter-examples. 
Implications. In order to preserve confluence under a “structural collapse”
(i.e., a ground ARS morphism plus a premise from Theorem we see from 
Theorem 0| cases Q and El that it is insufficient to simply prove a raw diamond 
property which admit an initiality condition on well-formedness of raw terms. 
Observe that this is exactly what happens in the wider programming language 
community when using the BVC. 



2.2 Resolving the Abstract Proof Burden of Confluence 

We now sketch the abstract part of the Tait/Martin-Löf proof method for
confluence as formalised by Nipkow m plus what we call Takahashi’s Trick 1^ .



A Formalisation of the Tait/Martin-Löf Method. The Tait/Martin-Löf
proof method uses a parallel relation that can contract any number of
pre-existing redexes in one step, cf. Figure 4. The crucial step in applying the method
is the following property of ARS.

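Lemma 5 $(\exists \rightarrow_2.\ {\rightarrow_1} \subseteq {\rightarrow_2} \subseteq {\twoheadrightarrow_1} \wedge \Diamond(\rightarrow_2)) \Rightarrow \mathrm{Confl}(\rightarrow_1)$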

Proof A formalisation is provided in m and is re-used here. □ 

The point is that, since a parallel relation, $\rightarrow_2$ above, can contract an
arbitrary number of redexes in parallel, only one reduction step is required to
contract the unbounded copies of a particular redex that could have been
created through duplication by a preceding reduction.



Takahashi’s Trick. In order to prove the diamond property of a parallel
β-relation, Takahashi m introduced the trick of using an inductively defined
complete development relation, cf. Figure 5, rather than proceed by direct means
(i.e., an involved case-splitting on the relative locations of redexes). Instead of
resolving a parallel divergence “minimally” (i.e., by a brute-force case-splitting),
Takahashi’s idea is to go for “maximal” resolution: the term that has all
pre-existing redexes contracted in one step is co-final for any parallel divergence.
Abstractly, the following ARS result underpins Takahashi’s idea up to the guarding
predicates which we have introduced.



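Lemma 6 (Takahashi’s Diamond Diagonalisation (Guarded)) For any
predicates, P and Q, and any relations, $\rightarrow_a$ and $\rightarrow_b$, we have

[Diagram: premises guarded by the predicates, concluding a diamond guarded by (P ∧ Q); the details are not recoverable from the source.]

Proof Straightforward. □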



The second premise is often called the triangle property when — is functional. 
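$$y[x := e] = \begin{cases} e & \text{if } x = y \\ y & \text{otherwise} \end{cases}$$

$$(e_1e_2)[x := e] = e_1[x := e]\, e_2[x := e]$$

$$(\lambda y.e')[x := e] = \begin{cases} \lambda y.e'[x := e] & \text{if } x \neq y \wedge y \notin FV(e) \\ \lambda y.e' & \text{otherwise} \end{cases}$$

$$\begin{array}{ll}
FV(y) = \{y\} & Capt_x(y) = \emptyset \\
FV(e_1e_2) = FV(e_1) \cup FV(e_2) & Capt_x(e_1e_2) = Capt_x(e_1) \cup Capt_x(e_2) \\
FV(\lambda y.e) = FV(e) \setminus \{y\} & Capt_x(\lambda y.e) = \begin{cases} \{y\} \cup Capt_x(e) & \text{if } x \neq y \wedge x \in FV(e) \\ \emptyset & \text{otherwise} \end{cases}
\end{array}$$

Fig. 1. Total but partially correct substitution, −[− := −], free variables, FV(−), and
variables capturing free occurrences of x, $Capt_x(−)$, for $\Lambda^{var}$.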
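
The three functions of Figure 1 translate almost literally into code. The Python sketch below is an illustration rather than the formalised definitions; the abstraction clause of `capt` follows the description of $Capt_x$ as the binders with a free occurrence of x in their scope, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def fv(e):
    """Free variables, FV(-)."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, App):
        return fv(e.fun) | fv(e.arg)
    return fv(e.body) - {e.var}


def capt(x, e):
    """Binders of e that have a free occurrence of x in their scope."""
    if isinstance(e, Var):
        return set()
    if isinstance(e, App):
        return capt(x, e.fun) | capt(x, e.arg)
    if x != e.var and x in fv(e.body):
        return {e.var} | capt(x, e.body)
    return set()


def subst(t, x, e):
    """Renaming-free substitution t[x := e]; total, but only partially correct."""
    if isinstance(t, Var):
        return e if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, e), subst(t.arg, x, e))
    if x != t.var and t.var not in fv(e):
        return Lam(t.var, subst(t.body, x, e))
    return t  # the substitution is simply not propagated under the binder
```

The substitution (λy.x)[x := y] leaves the term unchanged, which is incorrect, and this is exactly the situation excluded by requiring that FV(y) and $Capt_x$(λy.x) be disjoint.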



$$\frac{y \notin Capt_x(e) \cup FV(e)}{\lambda x.e \dashrightarrow^{y}_{i\alpha} \lambda y.e[x := y]} \qquad\qquad \frac{FV(e_2) \cap Capt_x(e_1) = \emptyset}{(\lambda x.e_1)e_2 \dashrightarrow_\beta e_1[x := e_2]}\ (\beta)$$

Fig. 2. Raw β- and indexed α-contraction; reduction is given by full contextual
closure. By the premises no invoked substitution will result in free-variable capture.



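$$\begin{array}{ll}
BV(x) = \emptyset & UB(x) = True \\
BV(e_1e_2) = BV(e_1) \cup BV(e_2) & UB(e_1e_2) = UB(e_1) \wedge UB(e_2) \wedge BV(e_1) \cap BV(e_2) = \emptyset \\
BV(\lambda x.e) = BV(e) \cup \{x\} & UB(\lambda x.e) = UB(e) \wedge x \notin BV(e)
\end{array}$$

Fig. 3. The bound variables and the uniquely bound predicate for the terms of $\Lambda^{var}$.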
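
The two definitions of Figure 3 are again direct structural recursions. A minimal Python sketch, with the hypothetical names `Var`, `App`, `Lam`, `bv` and `ub`:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def bv(e):
    """Bound variables, BV(-)."""
    if isinstance(e, Var):
        return set()
    if isinstance(e, App):
        return bv(e.fun) | bv(e.arg)
    return bv(e.body) | {e.var}


def ub(e):
    """Uniquely bound: no variable name is bound by two different binders."""
    if isinstance(e, Var):
        return True
    if isinstance(e, App):
        return ub(e.fun) and ub(e.arg) and not (bv(e.fun) & bv(e.arg))
    return ub(e.body) and e.var not in bv(e.body)
```

For example, (λx.x)(λy.y) is uniquely bound, whereas (λx.x)(λx.x) and λx.λx.x are not.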



3 The $\lambda^{var}$-Calculus

We will now formally define the A™“'-calculus and go on to show that its “struc- 
tural collapse” under a is the A-calculus proper as defined in Section o 

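Definition 7 (The $\lambda^{var}$-Calculus) The terms of the $\lambda^{var}$-calculus are $\Lambda^{var}$,
given on page 1. Substitution, free variables and capturing variables of raw
terms are defined in Figure 1. The β- and indexed α-rewriting relations of $\lambda^{var}$,
$\dashrightarrow_\beta$ and $\dashrightarrow^{y}_{i\alpha}$, are given inductively by contextual closure from Figure 2. Plain
α-rewriting is given as: $e_1 \dashrightarrow_\alpha e_2 \;\Leftrightarrow_{def}\; \exists z.\ e_1 \dashrightarrow^{z}_{i\alpha} e_2$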

The indexed α-rewriting relation will be used to conduct the ensuing proofs
but is, as such, not needed for defining the $\lambda^{var}$-calculus. We stress that the
(inductively defined) reduction relations also come equipped with first-order
induction principles. We will typically refer to uses of these as rule induction. The
main novelty in the above definition is the side-conditions on the contraction
rules that make binder-renaming unnecessary. The construct $Capt_x(e)$ returns
all the binding variables in e that have a free occurrence of x (relative to e) in
their scope. It coincides with the all but forgotten notion of not free for.
Substitution has been defined the way it has purely to enable us to prove certain
“renaming sanity” properties for it, which we however will not present here.

Proposition 8 $FV(e_2) \cap Capt_x(e_1) = \emptyset \Rightarrow e_1[x := e_2] = e_1\langle x := e_2\rangle$

Proof By structural induction in $e_1$. The only non-trivial case is $e_1 = \lambda y.e'$,
which is handled by a tedious case-splitting on y. The main case is $y \neq x$ and
$y \in FV(e_2)$. Here, the premise of the proposition means that $y \notin Capt_x(\lambda y.e')$,
which immediately implies that $x \notin FV(e')$ by $y \neq x$. We hence avoid $-\langle - := -\rangle$
performing a binder renaming. □



Lemma 9 — C — C ( — ^q,c) ^ 

Proof The first inclusion follows as the side-condition on — is subsumed 
by the side-condition on Any invoked substitutions thus coincide by Propo- 
sitionlHlwhose premise is established by the latter’s side-condition. The reasoning 
for the second inclusion is analogous. □ 



Lemma 10 ( — +Q,-Symmetry) 

O' 

Lemma 11 H = — = ==« 

Proof From Lemmas EandEI and Lemma cni respectively. □ 



Lemma 12 — C — C — 

Proof The first inclusion follows from Proposition 0 The second follows by 
observing that all the renamings required to perform the /3^-induced substitution 
preserve $\alpha^C$-equivalence, i.e., ℵ-equivalence. By Lemma 11 they can thus be
expressed by It suffices to observe that no renaming is performed following 

the “passing” of the substitution invoked by the /3-rule. □ 



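$\lambda^{var}$ α-Collapses to the Real λ-Calculus. With these fundamental results
in place, we have ensured the intuitive soundness of the following definition,
which mimics Hindley’s construction.

Definition 13 (The Real λ-Calculus)

— $\Lambda = \Lambda^{var}/{==_\alpha}$

— $\lfloor - \rfloor : \Lambda^{var} \rightarrow \Lambda$, $e \mapsto \{e' \mid e ==_\alpha e'\}$

— $\lfloor e\rfloor \rightarrow_\beta \lfloor e'\rfloor \;\Leftrightarrow_{def}\; e ==_\alpha;\dashrightarrow_\beta;==_\alpha e'$

Following on from the definition, we see that we have:

Proposition 14 $\lfloor e\rfloor \twoheadrightarrow_\beta \lfloor e'\rfloor \Leftrightarrow e\,(==_\alpha;\dashrightarrow_\beta;==_\alpha)^*\,e' \vee e ==_\alpha e'$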
$$\frac{}{x ⇸_\beta x} \qquad \frac{e ⇸_\beta e'}{\lambda x.e ⇸_\beta \lambda x.e'} \qquad \frac{e_1 ⇸_\beta e_1' \quad e_2 ⇸_\beta e_2'}{e_1e_2 ⇸_\beta e_1'e_2'}$$

$$\frac{e_1 ⇸_\beta e_1' \quad e_2 ⇸_\beta e_2' \quad FV(e_2') \cap Capt_x(e_1') = \emptyset}{(\lambda x.e_1)e_2 ⇸_\beta e_1'[x := e_2']}$$

Fig. 4. The parallel β-relation: arbitrary, pre-existing β-redexes contracted in parallel.



Proof The left-most disjunct is the straightforward transitive version of our 
definition of real /3. The right-most disjunct comes from the reflexive case, again 
by definition. □ 

We thus arrive at the following, rather appeasing, result. 

Lemma 15 [ej [e'J <14 e — ^au /3 e' <14 e --^q,cu/ 3 c e' <14 [ejn — [e'Jn 
Proof From Lemma E3 it is trivial to see that (==«; ==«)* U ==„ = 
-—»au /3 and the first biimplication is established by Proposition O The second 
biimplication follows by Lemmas E| and El The last biimplication follows in 

an analogous manner. □ 



Equivalence of the Raw and the Real Calculi. The technical reason for 
calling the above result “appeasing” is that it allows us to prove the equational 
equivalence results for the raw and the real calculi we have made reference to. 
We consider the second result to be of particular interest. 

Theorem 16 

- (A/=0) = (A™7==„u/ 3) = ==a^U/3c) = ((A™7 ==„h)/=^h) 

— Confl( — -o- Confl( — ^au/?) Confl( — -O- Confl( — 

Proof The first result is immediate following Lemma El As for the second 
result, the definitional totality and surjectivity of [— J and [— Jh combined with 
Lemma El allow us to apply Theorem 01 case 4. □ 

Having thus formally convinced ourselves that we are about to solve the right 
problem, we will now present the details of the confluence proof. 

4 An Equational Λ^var-Property and λ-Confluence

As outlined in Sections Q and 0 it suffices to find a raw relation over Λ^var which
enjoys the diamond property in order to prove the confluence property for the
λ-calculus. Taking the lead from the Tait/Martin-Löf method, this relation needs
to contain a notion of parallel β-reduction.

                   e ⇾_β e'            e ⇾_β e'
  ---------    ----------------    ----------------
  x ⇾_β x      λx.e ⇾_β λx.e'       x e ⇾_β x e'

  e1 e2 ⇾_β e'   e3 ⇾_β e3'
  --------------------------
   (e1 e2) e3 ⇾_β e' e3'

  e1 ⇾_β e1'   e2 ⇾_β e2'   FV(e2') ∩ Capt_x(e1') = ∅
  ---------------------------------------------------
            (λx.e1)e2 ⇾_β e1'[x := e2']

Fig. 5. The complete development β-relation: attempted contraction of all redexes.



Definition 17 Parallel β-reduction, ⇻_β, is defined in Figure 4.

The parallel β-relation admits the contraction of any number (including 0)
of pre-existing β-redexes starting from within as long as no variable renaming is
required. To give an impression of the level of detail of the formalisation, we can
mention that the property which we need the most in the proof development is
the following variable monotonicity result about the parallel β-relation:

Proposition 18 e ⇻_β e' ⇒ FV(e') ⊆ FV(e) ∧ BV(e') ⊆ BV(e)

In order to employ Takahashi's Trick, we need to ensure that any considered
β-divergence can be resolved by a complete development step.

Definition 19 Complete β-development, ⇾_β, is defined in Figure 5.

Observe, informally, that ⇾_β only is defined if all (pseudo-) redexes validate
the side-condition on the β-rule. Or, more precisely, the relation is defined if
it is possible to contract all (pseudo-) β-redexes starting from within; we will
shortly show that this is indeed possible. For now, we merely present:

Lemma 20 ⇾_β ⊆ ⇻_β

Proof Straightforward. □
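As an informal illustration of how the complete development relation only is defined when every side-condition holds, the following Python sketch computes a development from within and returns None when a side-condition fails. The tuple encoding of terms and the function names are assumptions of this sketch, and the definition of Capt used here (binders that have a free occurrence of x in their scope) is our reading of the side-condition, not a transcription of the formal development.

```python
# Raw terms as nested tuples (a hypothetical encoding chosen for this sketch):
#   ("var", x)   ("lam", x, body)   ("app", fun, arg)

def fv(e):
    """Free variables of e."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "lam":
        return fv(e[2]) - {e[1]}
    return fv(e[1]) | fv(e[2])


def capt(x, e):
    """Assumed reading of Capt_x(e): binders of e with a free x in their scope."""
    if e[0] == "var":
        return set()
    if e[0] == "app":
        return capt(x, e[1]) | capt(x, e[2])
    y, body = e[1], e[2]
    if y != x and x in fv(body):
        return {y} | capt(x, body)
    return set()


def subst(e, x, s):
    """Renaming-free e[x := s]; the caller checks capt(x, e) against fv(s)."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "app":
        return ("app", subst(e[1], x, s), subst(e[2], x, s))
    if e[1] == x:  # x is rebound here, so no free x occurs below
        return e
    return ("lam", e[1], subst(e[2], x, s))


def develop(e):
    """Complete development from within; None where no renaming-free step exists."""
    if e[0] == "var":
        return e
    if e[0] == "lam":
        body = develop(e[2])
        return None if body is None else ("lam", e[1], body)
    fun, arg = e[1], e[2]
    arg_dev = develop(arg)
    if arg_dev is None:
        return None
    if fun[0] == "lam":  # a redex: develop the body, then check the side-condition
        body_dev = develop(fun[2])
        if body_dev is None or fv(arg_dev) & capt(fun[1], body_dev):
            return None
        return subst(body_dev, fun[1], arg_dev)
    fun_dev = develop(fun)
    return None if fun_dev is None else ("app", fun_dev, arg_dev)


x, y = ("var", "x"), ("var", "y")
print(develop(("app", ("lam", "x", x), x)))                  # (λx.x)x
print(develop(("app", ("lam", "x", ("lam", "y", x)), y)))    # (λx.λy.x)y
```

The first term develops to x, while the second has no renaming-free development because y would be captured.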



The Overall Proof Structure. Having thus established the basics, we outline
the proof of the diamond property of the following relation: ↠_α ; ⇻_β, before
supplying the actual details of the proof. The relation is inspired by the
definitional reflection of the weak confluence property for the λ-calculus proper
over the structural (α-)collapse of Λ^var. In order to use the BVC in our proof,
we first present it as a predicate on Λ^var, cf. Figure El

Definition 21 (Barendregt Conventional Form)

BCF(e) = UB(e) ∧ (BV(e) ∩ FV(e) = ∅)

Lemma 22 ◇(↠_α ; ⇻_β)

Proof For the Ms given, we can construct the Ns in the divergence resolution
in the diagram in order. The ensuing sections will detail the individual
diagrams. The →_{α0}-relation is introduced in Definition 25 as the fresh-naming
restriction of the α-relation. It serves to facilitate the commutativity with β
on either side of the diagram. We note that the result means that it suffices
to address all naming issues before the combinatorially more complex
β-divergence, which can be addressed in isolation due to BCF-initiality.

[Diagram: the divergence resolution for M, M1 and M2, with α0-steps into
BCF-guarded terms in the upper part and the parallel β-diamond in the lower part.]

□



4.1 Substitutivity and Substitution Results 

When proving a commutativity result about two relations, you typically proceed 
by rule induction over one of the relations. In what amounts to the non-trivial 
sub-cases of such a proof you therefore typically need to show that a substitution 
from the case-instantiated relation “distributes” over the other relation. Such re- 
sults are called Substitutivity Lemmas. The non-trivial sub-cases of Substitutiv- 
ity Lemmas, in turn, are called Substitution Lemmas. They establish commuta- 
tivity of the substitutions from both the case-instantiations. Substitutivity and 
Substitution Lemmas are non-trivial to prove formally. For our present purposes 
we will merely display one of each to give an indication of the style. The key to
understanding the following lemmas is the fact that Capt_x(e1) ∩ FV(e2) = ∅ is
the weakest predicate ensuring the correctness of substituting e2 into e1 for x.

Lemma 23 (Substitution)

y ∉ FV(e2) ∧ x ≠ y ∧ (Capt_x(e3) ∩ FV(e2) = ∅) ∧ (Capt_y(e1) ∩ FV(e3) = ∅)
∧ (Capt_x(e1) ∩ FV(e2) = ∅) ∧ (Capt_x(e1[y := e3]) ∩ FV(e2) = ∅)
⇒
e1[y := e3][x := e2] = e1[x := e2][y := e3[x := e2]]

Lemma 24 (Parallel β Substitutivity)

e1 ⇻_β e1' ∧ e2 ⇻_β e2' ∧ (Capt_x(e1) ∩ FV(e2) = ∅) ∧ (Capt_x(e1') ∩ FV(e2') = ∅)
⇒
e1[x := e2] ⇻_β e1'[x := e2']

We refer the interested reader to the complete Isabelle/HOL proof development
at the homepage of the first author for full details.

4.2 Weak α- and β-Commutativity

In this section we prove the lemma that is needed on either side of the diagram
in the proof of Lemma 22. In trying to prove a general α and β commutativity
result, we are immediately stopped by the following naming issue: for virtually all
Λ^var-terms, there exist α-reductions that can invalidate a previously validated
side-condition on a β-redex. Fortunately, we can see that the commutativity
result we need concerns arbitrary β-reductions but only α-reductions that suffice
to prove Lemma 22. We therefore define a restricted, fresh-naming α-relation.
The definition can also be given inductively.

Definition 25 e →_{α0} e' ⇔ ∃z. e →^z_{iα} e' ∧ z ∉ FV(e) ∪ BV(e)



Lemma 26 [Diagram: weak commutativity of →_{α0} and β.]

Proof By rule induction, with the induction step going through
painlessly by freshness of the relevant z. □



4.3 The Diamond Property of Parallel β Up-to BVC-Initiality

We will now establish the lower part of the diagram in the proof of Lemma 22.
It is proved using Takahashi's Trick, cf. Lemma El Initially, we thus need to
establish the conditional existence of a non-renaming complete β-development.

Lemma 27 (BCF)• ⇾_β ∘

Proof By structural induction using Proposition d and Lemma d □

We stress that the proof is straightforward using the referenced variable
monotonicity results as ⇾_β is inductively defined to contract from within. No
complicated considerations concerning residuals are required. However,
BCF-initiality is crucial for the property. The terms (λx.λy.x)y and λy.(λx.λy.x)y
fail to enjoy free/bound variable disjointness and unique binding, respectively,
and neither completely develop. BCF-initiality is thus sufficient for the existence
of a complete development but only necessary in a weak sense: breaking either
conjunct of the BCF-predicate can prevent renaming-free complete development.
Still, some non-BCFs completely develop, e.g., (λx.x)x and λx.(λx.x)x.
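The two conjuncts of the BCF-predicate can be checked mechanically on the four example terms. The sketch below is illustrative only: the tuple encoding and function names are hypothetical, and UB is read as "no variable name is bound by more than one binder", which is an assumption about the definition given elsewhere in the paper.

```python
# Raw terms as nested tuples (hypothetical encoding for this sketch):
#   ("var", x)   ("lam", x, body)   ("app", fun, arg)

def fv(e):
    """Free variables of e."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "lam":
        return fv(e[2]) - {e[1]}
    return fv(e[1]) | fv(e[2])


def binders(e):
    """All binding occurrences in e, listed with repetitions."""
    if e[0] == "var":
        return []
    if e[0] == "lam":
        return [e[1]] + binders(e[2])
    return binders(e[1]) + binders(e[2])


def unique_binding(e):
    """Assumed reading of UB(e): no name is bound twice."""
    bs = binders(e)
    return len(bs) == len(set(bs))


def bv_fv_disjoint(e):
    """BV(e) and FV(e) share no variable."""
    return not (set(binders(e)) & fv(e))


def bcf(e):
    """BCF(e) = UB(e) and BV(e), FV(e) disjoint."""
    return unique_binding(e) and bv_fv_disjoint(e)


x, y = ("var", "x"), ("var", "y")
examples = {
    "(λx.λy.x)y": ("app", ("lam", "x", ("lam", "y", x)), y),
    "λy.(λx.λy.x)y": ("lam", "y", ("app", ("lam", "x", ("lam", "y", x)), y)),
    "(λx.x)x": ("app", ("lam", "x", x), x),
    "λx.(λx.x)x": ("lam", "x", ("app", ("lam", "x", x), x)),
}
for name, term in examples.items():
    print(name, unique_binding(term), bv_fv_disjoint(term), bcf(term))
```

None of the four terms is a BCF; the first breaks only the disjointness conjunct and the second only unique binding, in line with the discussion above.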

The second of the two required results for the application of Lemma El must
establish that any parallel β-step always can “catch up” with a completely
developing β-step by a parallel β-step, with no renaming involved.

Lemma 28 e ⇻_β e1 ∧ e ⇾_β e2 ⇒ e1 ⇻_β e2

Proof By rule induction using Lemma 24. □



It is interesting that the above property requires no initiality conditions, like
the BCF-predicate, to be provable, except, that is, from well-definedness of
⇾_β. This is mainly due to our use of the weakest possible side-condition on
β-contraction to make β renaming-free (i.e., FV(−) ∩ Capt_−(−) = ∅). Had we
instead required that the free variables of the argument were disjoint from the
full set of bound variables in the body of the applied function (i.e., FV(−) ∩
BV(−) = ∅), the property would not have been true. A counter-example is
(λy.(λx.y)z)λz.z. It takes advantage of complete developments contracting from
within. Contracting the outermost redex first (e.g., by a parallel step) blocks
the contraction of the residual of the innermost redex when the stronger
side-condition is imposed: (λx.λz.z)z. No variable conflict is created between two
residuals of the same term due to Hyland’s Disjointness Property HS]I1 



Lemma 29 [Diagram: the diamond property of ⇻_β, with the initial term guarded by BCF.]

Proof From Lemmas 27 and 28 by using Takahashi's Trick, Lemma ISl □



4.4 Fresh-Naming α-Confluence with BVC-Finality

The last result we need for the proof of Lemma 22 is the top triangle with its
leg. We prove it as two results (mainly out of formalisation considerations);
the first form suffices by Lemma E3

[Diagrams: fresh-naming α0-confluence, with the final term guarded by BCF.]

The proofs do not provide any insights and have been omitted.



4.5 Confluence 

We have thus completed the proof of Lemma 22 and only one more lemma is
needed before we can conclude our main result.

Lemma 30 →_{α∪β} ⊆ ↠_α ; ⇻_β ⊆ ↠_{α∪β}

Proof By rule induction observing that both ↠_α and ⇻_β are reflexive. The
proofs of the inclusions go through straightforwardly. □

Theorem 31 (Confluence of the Raw and Real A-Calculi) 

Conji{—^auf3) A Confl{-^/ 3 ) A C'on/i'(— »„cu^c) A Confl{-^pn) 

Proof By Lemmas EIE2I and RHland then Theorem cni □ 



“Any two residuals of some sub-term in a residual of the original term are disjoint”.



320 R. Vestergaard and J. Brotherston 



5 Conclusion 

We have completed a confluence proof applying to several raw and real λ-calculi.
It has been done by using first-order induction principles over Λ^var and
reduction relations, only. It is the first proof we know of which clearly makes the
raw-/real-calculi distinction. It does so by introducing a new result about
preservation/reflection of confluence. It is also the first formalised equational
result about a higher-order language which conducts its inductive reasoning over
FOAS_VN, as you do informally by hand.



A Rational Reconstruction of the BVC. We proved two results about parallel
and completely developing β-reduction, Lemmas 27 and 28, in order to apply
Takahashi's Trick. In summary, they say that irrespective of which pre-existing
β-redexes in a BCF-term you contract in parallel and without performing
renaming, it is possible to contract the residuals of the rest in parallel and without
performing renaming and arrive at the completely developed term. All in all,
the residual theory of β in Λ^var is renaming-free up-to BCF-initiality. This is
partly a consequence of Hyland’s Disjointness Property m and partly due to 
our careful use of substitution. Said differently, Barendregt’s moral: 

“2.1.14. Using 2.1.12/13 one can work with λ-terms the naive way.”

is formally justifiable and is, in fact, an entirely reasonable way to conduct
equational proofs about the λ-calculus when due care is taken to clarify the raw
vs. real status of the established property.

A Commutative Diagrams

Formally, a commutative diagram is a set of vertices and a set of directed edges
between pairs of vertices. A vertex is written as either • or ∘. Informally, this
denotes quantification modes over terms, universal respectively existential. A
vertex may be guarded by a predicate. Edges are written as the relational symbol
they pertain to and are either full-coloured (black) or half-coloured (gray).
Informally, the colour indicates assumed and concluded relations, respectively.
An edge connected to a ∘ must be half-coloured. A diagram must be type-correct
on domains. A property is read off of a diagram thus:

1. write universal quantifications for all •s (over the relevant domains)

2. assume the full-coloured relations and the validation of any guard for a •

3. conclude the guarded existence of all ∘s and their relations

The following diagram and property correspond to each other (for → ⊆ A × A).

(P)• → •        ∀e1, e2, e3 ∈ A. e1 → e2 ∧ e1 → e3 ∧ P(e1)
   ↓   ↓                              ⇓
   • → ∘(Q)     ∃e4 ∈ A. e2 → e4 ∧ e3 → e4 ∧ Q(e4)
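For a finite relation, the property read off the example diagram can be checked by brute force. The following Python sketch is only an illustration of the reading-off procedure; the function name, the representation of the relation as a set of pairs, and the default guards are assumptions of the sketch.

```python
def diagram_holds(domain, rel, P=lambda e: True, Q=lambda e: True):
    """Check the property of the example diagram on a finite relation.

    For all e1, e2, e3 with e1 -> e2, e1 -> e3 and P(e1),
    there must be some e4 with e2 -> e4, e3 -> e4 and Q(e4).
    """
    succ = {e: {b for (a, b) in rel if a == e} for e in domain}
    for e1 in domain:
        if not P(e1):  # guard on the universally quantified vertex
            continue
        for e2 in succ[e1]:
            for e3 in succ[e1]:
                # existential vertex: a common successor satisfying Q
                if not any(Q(e4) for e4 in succ[e2] & succ[e3]):
                    return False
    return True


domain = {0, 1, 2, 3}
closed = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 3)}
broken = {(0, 1), (0, 2), (2, 3), (3, 3)}
print(diagram_holds(domain, closed))
print(diagram_holds(domain, broken))
print(diagram_holds(domain, broken, P=lambda e: e != 0))
```

The third call shows the effect of a guard: once the divergence at 0 is excluded by P, the remaining steps of the second relation can all be closed.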

We will often leave quantification domains implicit and furthermore assume the
standard disambiguating conventions for binding strength and associativity of
connectives.

Abstract. In this paper the context-splittable normal form for rewriting
systems defining Church-Rosser languages is introduced. Context-splittable
rewriting rules look like rules of context-sensitive grammars
with swapped sides. To be more precise, they have the form uvw → uxw
with u, v, w being words, v being nonempty and x being a single letter
or the empty word. It is proved that this normal form can be achieved
for each Church-Rosser language and that the construction is effective.



1 Introduction 



Church-Rosser languages (CRLs) basically are defined by a length reducing
string rewriting system and a mechanism to handle the word ends [MNO88].
Here we call such a defining system a Church-Rosser language system (CRLS).
CRLs are a very interesting class of languages for three reasons: (i) Their word
problem can be decided in deterministic linear time, although they are a strict
superset of the deterministic context-free languages (DCFL) [MNO88]. (ii) Despite
that fact, their definition is more intuitive than that of DCFL. (iii) They
are the deterministic variant of the growing context-sensitive languages (GCSL),
which was proved in [NO98] (for the definition of GCSL see [DW86]). Therefore,
they fit into the Chomsky hierarchy very well [Cho59, McN99, BHNO00].

By assigning weights to single letters one can also define a weight function
for words. This is the basis of the following important characterization result
about Church-Rosser languages: Allowing weight reduction instead of length
reduction does not improve the expressive power [NO98]. Given a rewriting system
with weight reducing rules defining a Church-Rosser language it is possible to
construct an equivalent one that only has length reducing rules.

In this paper, we use this fact to show the effective existence of a
context-splittable normal form for every Church-Rosser language L. The defining (length
reducing) rewriting system of L can be simulated by a weight reducing system
that has rules of the form uvw → uxw with u, v, w being words, v being
nonempty and x being a single letter or the empty word. We do not use the
term ‘context-sensitive’ in order to stress the fact that the two forms are not
fully corresponding to each other. Because context-splittable rules can also be
deleting (i.e. of the form u → □), we do not always get a context-sensitive rule
by swapping the sides.

One consequence of this normal form result is that the information flow
during reductions is subject to stronger restrictions in a context-splittable CRLS
(csCRLS): Any movement of a letter in either direction needs at least as many
rule applications as the distance to be accomplished. Although this is only a
refinement of the linear time bound for the reductions in CRLSs it might be
handy for proofs.

This paper is organized in the following way: In the next section, we give the
basic definitions. Section 3 contains the normal form theorem and the
construction which is used to prove it. In order to enhance the readability of this paper
we concentrate on the construction principles and do not give the proof in full
detail. All details omitted are merely technical; they can be found in [Woi00a].
The last section of this paper contains some concluding remarks which also give
an insight into possible further consequences of the normal form result, especially
with respect to GCSL.

2 Basic Definitions 

The reader is assumed to be familiar with definitions and notations of confluent
string rewriting. For details, see [Jan88], [MNO88], and [BO93].

A string-rewriting system (or simply rewriting system) R on Σ is a subset of
Σ* × Σ*. For (u, v) ∈ R we also write (u → v) ∈ R and call (u, v) a rule.

A weight function is a function f : Σ → ℕ. It is recursively extended to a
function on Σ* by f(wx) := f(w) + f(x) and f(□) := 0 (where □ is the empty
word) with w ∈ Σ*, x ∈ Σ. An example for a weight function is the length
function with f(x) := 1 for all x ∈ Σ, then f(w) = |w|.

Throughout this article, a string-rewriting system R is called a weight
reducing system, if there exists a weight function f such that f(u) > f(v) for all
(u, v) ∈ R.
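To make the extension of a weight function to words concrete, the following Python sketch computes f(w) letter by letter and tests whether a given rule set is weight reducing with respect to one given f. The function names and the sample rules are hypothetical; note that the definition asks for the existence of some suitable f, whereas the sketch only checks a candidate that is supplied.

```python
def weight(word, letter_weight):
    """f extended to words: f(wx) = f(w) + f(x), with weight 0 for the empty word."""
    return sum(letter_weight[x] for x in word)


def is_weight_reducing(rules, letter_weight):
    """True if f(u) > f(v) for every rule (u, v) under the given letter weights."""
    return all(weight(u, letter_weight) > weight(v, letter_weight)
               for (u, v) in rules)


length = {"a": 1, "b": 1, "c": 1}    # the length function as a weight function
heavier = {"a": 2, "b": 2, "c": 1}   # a and b weigh more than c

rules = [("ab", "cc")]               # not length reducing, since |ab| = |cc|
print(weight("abc", length))
print(is_weight_reducing(rules, length))
print(is_weight_reducing(rules, heavier))
```

The rule ab → cc is not length reducing, yet it is weight reducing under the second assignment, which is the kind of situation covered by the characterization result quoted in the introduction.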

Definition 1. A Church-Rosser language system (CRLS) is a 6-tuple C =
(Γ, Σ, R, k_l, k_r, y) with finite alphabet Γ, terminal alphabet Σ ⊂ Γ (Γ \ Σ is the
alphabet of nonterminals), finite confluent weight reducing system R ⊆ Γ* × Γ*,
left and right end marker words k_l, k_r ∈ (Γ \ Σ)* ∩ Irr(R), and accepting
letter y ∈ (Γ \ Σ) ∩ Irr(R). The language defined by C is defined as:
L_C := {w ∈ Σ* | k_l · w · k_r →*_R y}

A language L is called a Church-Rosser language (CRL) if there exists a
CRLS C with L_C = L.

The definition of Church-Rosser languages is due to McNaughton, Narendran,
and Otto [MNO88]. The definition of Church-Rosser language systems given
here is a convenient notation for their definition. Niemann and Otto proved in
[NO98] that the expressive power of Church-Rosser languages is not enhanced
by allowing arbitrary weight functions instead of the length function, so this fact
is used, too.

The following definition of a syntactical restriction for CRLS will be proved
to be a normal form in the main part of this paper.

Definition 2. A CRLS C = (Γ, Σ, R, ¢, $, y) is context-splittable (C is a
csCRLS) if ¢, $, y ∈ Irr(R) ∩ (Γ \ Σ) (let the inner alphabet be
Γ_inner := Γ \ {¢, $, y}) and for any rule r ∈ R there exists a splitting (u, v, w, x)
with:

1. r = (uvw, uxw)

2. v is non-empty.

3. uvw may contain at most one ¢ and if so at its beginning. Also it can have
at most one $ which only may appear at the end. All other letters of uvw
have to be from the inner alphabet Γ_inner.

4. x is a single letter not equal to ¢ or $ or it is the empty word.

5. If v contains ¢ or $, then x = y, u and w are empty, and v is element of
¢ · Γ_inner* · $.

6. If x = y, then u and w are empty, and v is element of ¢ · Γ_inner* · $.

The splitting (u, v, w, x) of a rule r allowed by this is called a context
splitting, u and w are called the left and right context.

Example 1. These are some examples for the meaning of the definition (the
splittings are marked by dots):

- ab · dea · ab → ab · a · ab,

- ab · de · aab → ab · □ · aab is another context-splitting of the same rule,

— Cl - - $ — >■ Cl - □ • 

- ¢ · ab$ · □ → ¢ · $ · □ is not a valid splitting,

- abc → □ is a deleting context-splittable rule, and

- abed → deba is a rule that is not context-splittable.
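The examples can be reproduced with a small enumeration of candidate splittings. The Python sketch below is a simplification under stated assumptions: it only checks conditions 1 and 2 of Definition 2 together with the length bound on x from condition 4, and it ignores the end marker conditions 3, 5 and 6. The function name is hypothetical.

```python
def splittings(lhs, rhs):
    """All (u, v, w, x) with lhs = u v w, rhs = u x w, v nonempty and |x| <= 1.

    End marker conditions of the definition are deliberately not checked.
    """
    found = []
    for i in range(len(lhs)):                 # i = |u|
        for j in range(i + 1, len(lhs) + 1):  # v = lhs[i:j] is nonempty
            u, v, w = lhs[:i], lhs[i:j], lhs[j:]
            if len(rhs) < len(u) + len(w):
                continue                      # contexts would overlap in rhs
            if rhs.startswith(u) and rhs.endswith(w):
                x = rhs[len(u):len(rhs) - len(w)]
                if len(x) <= 1:
                    found.append((u, v, w, x))
    return found


for lhs, rhs in [("abdeaab", "abaab"), ("abc", ""), ("abed", "deba")]:
    print(lhs, "->", rhs or "□", splittings(lhs, rhs))
```

For the first rule the enumeration contains both splittings shown in the example, the deleting rule has exactly one splitting with empty contexts, and the last rule has none.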



Remark 1. We want to distinguish the notion of context-splittable CRLSs from
that of context-sensitive grammars, because the former uses reductions and the
latter productions for defining languages. Especially deleting rules of the form
v → □ have no counterpart in context-sensitive grammars. Therefore we do not
use the term context-sensitive. Note that all other deleting rules with u or w
being nonempty can be split in a different way such that x is not the empty
word, just as in the example given above.



3 The Normal Form Theorem 

Theorem 1. Let C = (Γ, Σ, R, k_l, k_r, y) be a CRLS with language L_C. Then
there exists a csCRLS C' with L_{C'} = L_C.

Proof. We will give an effective construction for such a new csCRLS C'.

Remark 2. Although we will not need a formal definition of the automaton model
for the language family CRL, a variant of two pushdown automata, it will be
helpful to understand parts of the construction. Informally, these automata work
as follows (cf. [Boo82]): They have two stacks and a scanning head. The initial
configuration is a left stack with the left end marker word¹ in it, its leftmost
letter at the bottom of the stack. The right stack contains the input, the leftmost
letter of the input being on top, and below it the right end marker word. Now,
letters from the right stack are shifted to the left, until a suffix of the left stack
is the left side of a reduction rule of the CRLS. This suffix is deleted from the
left stack and the right side of the rule is pushed onto the right stack. Because
of the confluence of the underlying string-rewriting system we can make this
automaton deterministic by choosing only one rule for every left side and by
using the rule with the shortest left side if two or more different left sides apply.
The procedure is repeated until the right stack is empty and the left stack is
irreducible. If the left stack only contains the accepting letter of the CRLS the
input is accepted. Note that since after a reduction operation the left stack
will always be irreducible, one can directly combine one shift with each reduce
operation, as in [MNO88].
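The informal description of the automaton can be turned into a short simulation. The Python sketch below is illustrative only: the function name is hypothetical, and the toy system (rules ab → □ and LR → Y with end markers L and R and accepting letter Y) is an assumption made for the example, not a system taken from the construction. Rules are given as a dictionary, so there is exactly one rule per left side, and the shortest matching left side is preferred.

```python
def accepts(word, rules, k_left, k_right, accept):
    """Two-stack shift-reduce simulation of a length-reducing CRLS.

    rules maps each left side to its right side (one rule per left side).
    """
    left = list(k_left)                      # bottom of the stack first
    right = list(reversed(word + k_right))   # top of the stack is the list end
    by_length = sorted(rules, key=len)       # prefer the shortest left side
    while True:
        match = next((l for l in by_length
                      if len(l) <= len(left)
                      and "".join(left[-len(l):]) == l), None)
        if match is not None:
            del left[-len(match):]                   # pop the matched suffix
            right.extend(reversed(rules[match]))     # push the right side
        elif right:
            left.append(right.pop())                 # shift one letter
        else:
            break                                    # nothing left to do
    return left == [accept]


# Toy system: words over {a, b} that reduce to the empty word by ab -> □.
TOY_RULES = {"ab": "", "LR": "Y"}
for w in ["", "ab", "aabb", "abab", "aab", "ba"]:
    print(repr(w), accepts(w, TOY_RULES, "L", "R", "Y"))
```

Because every reduction shortens the combined stack content and every shift moves one letter to the left stack, the loop terminates for length-reducing rule sets.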

3.1 The Construction Principles 

Without restriction of generality we may assume that R is length reducing
[NO98]. In order to construct a csCRLS C', we will make use of the following
four principles: 

1. Analogous to the automata model our new system will have the property 
that during the whole reduction process there is always exactly one place in 
the word where the next reduction rule can be applied. 

2. We will use a compression alphabet which can store more than one letter of
the input (resp. the derived words) in one letter. This information will be
represented by subscripts of the compression letters.

3. These compression letters will be enriched by surplus letters in their sub- 
scripts in order to spread necessary weight reductions over more than one 
letter. 

4. Rules of the original system will, in most cases, be simulated by three or 
four rules in the new system. 

The confluent weight reducing system will be built of five parts R_1 to R_5.

Definition 3. With Γ being the alphabet of C which consists of all terminal and
nonterminal letters, let Γ̄ be a new alphabet, which is a disjoint copy of Γ. Then
‾ is the bijective morphism that maps Γ into Γ̄, e.g. a to ā. Let #, ¢, and $ be
new symbols. Let Γ_# := Γ ∪ {#} and Γ̄_# := Γ̄ ∪ {#}. Define

:= t; • f; n . ((P u .(TUF)- u 

where #^{≤2} is a shorthand for {□, #, ##}.

¹ Note that both end marker words are irreducible.

As can be seen, this regular language consists of words, where between letters
of Γ or Γ̄ there are always exactly two #'s. This language will not only be used
to define the compression alphabet, but also to make some definitions during
the construction process easier. The purpose of the #'s will be explained later.

Definition 4. Let μ_l = max{|u| | (u, v) ∈ R} and μ_r = max{|v| | (u, v) ∈ R}
be the maximum length of the respective rule sides. Because R is length reducing
μ_l > μ_r holds. Let μ := max{μ_l, |k_l|, |k_r|}. The compression alphabet A is
defined as:

A := I w G kbjt A 1 < |w| < 3^ + 5}. 

The elements of A are called compression letters. For distinction, letters in 
the index of a compression letter will be called index letters. 



Remark 3. At least μ + 1 letters of the original alphabet Γ can be stored in one
compression letter.



Definition 5. Sometimes it is necessary to extract the information of the sub- 
scripts in a word from Tf . Let ~ be the morphism '' : (A U T U T U {ci, $})* — >■ 
(Tjj U /]))* defined by: 



I w x = £ Fi 

X := < 

I X else 

We assume, without loss of generality, that brackets are not in our alphabets 
so far and use them in the following for better readability. 



3.2 Translating the Input 

The first step in the simulation of C is to translate the input into the compression
alphabet. At the same time, we will take care of k_l and k_r. The new end marker
letters(!) of C' will be k_l' := ¢ and k_r' := $.

Short words w ∈ L_C, |w| < 2 will be handled separately with a set R_1 of
rules: R_1 := {(¢w$, y) | w ∈ L_C ∧ |w| < 2}

Obviously, R_1 can be computed easily.

For translating the input a set of rules R_2 will be used, which works as
follows: Decompose k_l and k_r into single letters in the following way: k_l =
a_1 a_2 ··· a_{|k_l|} and k_r = c_1 c_2 ··· c_{|k_r|}. R_2 will be designed to be a suitable
confluent and weight reducing rewriting system such that for every input word w with
w = b_1 b_2 ··· b_i ··· b_{|w|}, b_i ∈ Σ (1 ≤ i ≤ |w|) we can make the following reduction
with R_2:

Furthermore we require that the first translated letter to the right of the
symbol ¢ is always produced by the last reduction step. This is necessary to
give a precise moment in the reduction after which the rule sets defined in the
following parts of the construction can begin to work.

Example 2. Let k_l = dd, k_r = c, and w = abad. Then R_2 translates the input to:

It is easily verified that it is possible to construct R_2 such that R_1 ∪ R_2 is
confluent and, with a suitable weight function, weight reducing.

3.3 Simulating Shift Operations 

The next step is similar to the shift operations of automata for CRLs. Sometimes
it is necessary to move right (that is, shift) the position of a possible next
reduction.

The right-most overlined letter marks the position at which the simulation
of C will work with the following rule sets. In general, the right-most overlined
index letter of a compressed word can be identified with the head position of the
automaton described above.

The simulation of shifting is done with a further set of rules which is called
R_3. A simulated shift has to take place whenever the overlined letters within
the compression letters form an irreducible string w.r.t. R when the overlining
is removed. We omit the details of R_3; shifting corresponds to overlining the
first letter in the indices which is in Γ. If a shift is necessary but no next such
letter without overlining exists, no next reduction is possible. Then the simulated
system also would have come to an irreducible word.

The matter of weight reduction will be discussed later; at the moment simply
assume that overlined index letters add slightly less to the weight than
non-overlined ones.

Example 3. Assume that bba is irreducible and no suffix of a left-hand side of
a rule in R. Then a simulated shift is necessary whenever (x ∈ Γ)

appears in the indices. For example, one of the necessary rules could be:

(?ctitt5|)?ti5ti#ati^#°tl’?ctltt5tt^ti5tittatt^#°ti) € -^3 

Again, we do not give the full details of R_3, because one can easily see that
it is possible to construct it such that R_1 ∪ R_2 ∪ R_3 is confluent and weight
reducing.

3.4 The “Weight Spreading” Strategy 

Now we come to the core of the construction, which will be called weight spread- 
ing. The main idea is to simulate rules of R piecewise. In order to achieve a 
weight reducing system a second idea is used: the simulation will reduce the 
length of the subscripts of the compression letters. 

First, we provide an example that shows the correct construction at work.

Example 4. Assume we have a rewriting system R with (aaaa, bbb) ∈ R. Then
one case in the simulation is the following:

^bf,ij^biib^Ua 

The #'s are used to spread the length reduction of the original rule over the
compression letters in the simulation. Observe that (especially) in the last step
of the example the result has three compression letters. More than three letters
are not necessary during any rule simulation, because a right-hand side of a
rule and the added #'s fit into one compression letter (to be exact, sometimes
some unchanged context appears on the right, making four letters necessary).
Basically, the weight of a compression letter will be computed from the number
of its index letters (overlined letters add slightly less to the weight). Therefore in
the worst case, when the original rule has a length reduction of one, we reduce
the number of index letters by three and spread this reduction over the resulting
three compression letters. This is the cause for adding two #'s after each index
letter of Γ ∪ Γ̄ during the translation with R_2.

In generalising this example, a lot of cases have to be handled. The main 
idea is to identify all possibilities how the left-hand side of an original rule can 
be split over one or more letters of the compression alphabet. Also, sometimes 
it is necessary to allow some unchanged context in the first compression letter. 
Furthermore, the question of finding the right place for the reduction to work 
has to be handled. 



1. “lock” with new nonterminal 

2. change first letter 

3. change middle letter 

4. change last letter, remove lock 



3.5 Case Distinctions 

For lack of space, we will not give the full construction. Instead, the necessary
distinction of cases and an example how to handle one of these cases are given.
With this example the reader should be able to understand the principles of the
construction. The complete construction can be found in the technical report
[Woi00a].

Definition 6. The following notation for the set of compression letters whose 
subscript begins with a letter from E will be used: 

^r-r* := | ^ A A ui e T • Ajj*}

In some cases, the simulation is very easy, especially when the complete
reduction can be simulated within one letter of the compression alphabet. We
handle this first case with the following set:

T\ . — V ^ I ("U, n) (z R 

A W1W2W3 G Wf 

A wi £ rj 

A W 2 = ft A W 2 G F • Fj) • F • {tttt} 

A {{w3 G FF|j* A a: = □) V {ws = □ A a: G {$} U Cr r*)) 

A (a: $ => wiwawsa; G 

Now, for each element t ∈ T_1 assume u = a_1 ··· a_{|u|}, a_i ∈ Γ (1 ≤ i ≤ |u|)
and v = b_1 b_2 ··· b_{|v|}, b_i ∈ Γ (1 ≤ i ≤ |v|). For each t add a rule r_t to a new
system R_4:



If w_1 = w_3 = v = □, this would produce a compression letter with an empty
index, which is not in the compression alphabet. In these cases identify it with □.
Then the rule will simply be deleting one symbol.

Now we come to the difficult part of the construction. What has to be done,
if the left side of the original rule is not in one letter but distributed over the
indices of several compression letters? Again we use a set which contains all cases
of possible rule applications that are not covered by the above set T_1:

F 2 := ^u,^iu 2^W3^W4‘ ‘ ' ^Wn-2^'Wn-l'Wn^} I (^5^) ^ F 

A 4 < n 

A Wi ■ ■ -w„ G Wn 

A G fJ 

A W 2 W 3 ■ ■ ■ Wn-i G F • Fj • F • {jltt} 

A W2 yf □ 

A W2W3 ■ ■ ■ Wn-l = u 

A {{Wn G F • F„*) V (w„ = □ A X G {$} U Cr-r;)) 

A (x yf $ Wi • • • WnX G W{)} 

For each t ∈ T_2 (T_2 is finite) a set of rules is added to R_4. This also needs some
further nonterminals. These will be collected in the set Γ_2. Again, we assume
u = a_1 ··· a_{|u|}, a_i ∈ Γ (1 ≤ i ≤ |u|) and v = b_1 b_2 ··· b_{|v|}, b_i ∈ Γ (1 ≤ i ≤ |v|). The
following cases have to be dealt with. We do not give all details, but with the
example after the list of cases the construction should be clear enough. Whenever
we speak of three or four rules these are a simulation of an original rule in as
many steps.

Let t (zz, X, ^WiW2^W3^W4 ‘ ‘ * ^Wn. — 2^'Wn — l'Wn^} ^ ^2 ■ 

1 . n = 4 , ZX3 = □: already covered by the rules for Ti. 

2. n = 4, v = □, w_3 ≠ □: all deleting rules need some extra care; four subcases:
2.1. w_1 = □, w_4 = □: one rule which simply deletes two letters:

■— i^WiW2^W^W4^^ ■ 

2.2. rci □, ru4 = □ use a new nonterminal with — 1 

and three rules. 

2 . 3 . wi = a,w4^a use only one rule: n := {^wiW2iw3W4.x,^4^^x) 

2.4. $w_1 \neq \square$, $w_4 \neq \square$: use a new nonterminal and three rules.

3. $n = 4$, $v \neq \square$, $w_3 \in \{\#, \#\#\}$: similar to the rules for $T_1$, no subcases, one rule:
Xt ■= iiwiW 2 ^W 3 W 4 X,^^^l,l(Ub 2 )-iMb\v\)w 2 ^'^ 3 '"‘i^^ 

4 . n = 4 ,u yf D, Ircsl > 2 , (then 1V3 contains at least one letter from F), 
the reduction must be split over two nonterminals. For W2 and W3 there 
exist splittings such that W2 = W2W2 and W3 = W3W3 with W2W3 = jltt. 
Since both W2 and W3 are nonempty, we know that there exists an i, 1 < 
i < |u| with w'2 = OiUtt-’-ai and rUg = Oi+ittU • • • a|„|ttt|. Now there are five 
subcases, depending on the length of Wi, w'^, and i, which we identify using 
the following notation: 

Let k := |u| — |m| + t. If fc > 0 this will be used to calculate a split point 
for the compressed word which is to be substituted. In some cases we need 
a new nonterminal which will be added to F2. Let the weight of be 

■“ ' 4 ^{^W 3 W 4 ) ~ 1- 

4.1. If $k \le 0$ and $w_1 = \square$, add the rule

n ■= i^wiW 2 ^w 3 W 4 X, ■ 

4.2. $k \le 0$, $w_2'' = \square$, and $w_1 \neq \square$: one new nonterminal and three rules.

4.3. $k > 0$ and $w_2'' = \square$: one new nonterminal and three rules.

4.4. $k \le 0$, $w_2'' \neq \square$, and $w_1 \neq \square$: one new nonterminal and three rules.

4.5. $k > 0$ and $w_2'' \neq \square$: one new nonterminal and three rules.

5. $n = 5$, $v \neq \square$, $w_4 = \square$: already covered by 3. and 4.

6. $n = 5$, $v = \square$, $w_4 \neq \square$: deleting rules, we have four subcases:

6.1. $w_1 = \square$, $w_5 = \square$: add one rule which simply deletes three letters:

Xt ■— i^WiW 2 ^'W 3 ^W 4 W 3 X ^ X) 

6.2. wi y^ □, W5 = □: we use a new nonterminal with = V'(Ciu4t05) ~ 1 
and three rules. 

6.3. $w_1 = \square$, $w_5 \neq \square$: only one rule:

Xt ■ (.^•WiW 2 ^W 3 ^W 4 W 3 X ^ ^yj^x') 

6.4. $w_1 \neq \square$, $w_5 \neq \square$: one new nonterminal and three rules.

7. $n = 5$, $v \neq \square$, $w_4 \in \{\#, \#\#\}$: the complete reduction takes place in the first
two nonterminals. This is similar to $n = 4$, $v \neq \square$. These are the subcases:

7 . 1 . W3 = U, then we can make a one rule reduction without new nontermi- 
nal: {^wiW 2^W3^W4W5X, ^wibi{<li<ib2)--{'iib\v\)w3^'^*'"5x) 

7 . 2 . W3 yf jl, then W3 contains at least one letter from F, since W3 = Utt would 
imply W4 ^ {tt,Stt}. 

For 1U2, W3 and W4 there exist splittings such that W2 = w'2w'2, W3 = 
w'^w'^w'^' and W4 = with w'^w'^ = jjjl and w'^'w'4 = UlJ. Since both 

W2 and W3 are nonempty, we know that there exists an t, 1 < z < |u| with 
w'2 = aiUtt---ai and w'3 = Oi+ijltt • • • a|„|. Now there are five subcases, 
depending on the length of wi, w'2, and i. 
Let $k := |v| - |u| + i$. If $k > 0$ this will be used to calculate a split point
for the compressed word which is to be substituted. In all cases with
more than one rule we use a new nonterminal which will be added
to $\Gamma_2$.

7.2.1. $k \le 0$ and $w_1 = \square$ (one rule),

7.2.2. $k \le 0$, $w_2'' = \square$, and $w_1 \neq \square$ (three rules),

7.2.3. $k > 0$ and $w_2'' = \square$ (three rules),

7.2.4. $k \le 0$, $w_2'' \neq \square$, and $w_1 \neq \square$ (three rules), and

7.2.5. $k > 0$ and $w_2'' \neq \square$ (three rules).

8. $n = 5$, $v \neq \square$, $w_3 \in \{\#, \#\#\}$, $|w_4| > 2$: then $w_4$ contains at least one letter from
$\Gamma$, since $|w_4| > 2$. Compute $k$ as above; we have three subcases:

8.1. $k \le 0$ and $w_1 = \square$ (one rule),

8.2. $k \le 0$ and $w_1 \neq \square$ (three rules), and

8.3. $k > 0$ (three rules).

n = 5,w □, jiusl > 2, |r<; 4 | > 2, (W3 cannot be empty), and W4 ^ 

collection of subcases. In this case W2,w^, and W4 contain at least one letter 
from r. 

For W2, W3 and W4 there exist splittings such that W2 = w'2w'2, W3 = w'^w'^w'” 
and W4 = w'4w'l with w'^w'3 = Dj) and w'^'w'4 = Dj). Since both W2, W3 and 
W4 are nonempty, we know that there exist an i,j,l < i < j < luj with 
w'2 W 3 =ai+4’^j,---a0, and w'{ = a^+itltl • • • a|„|Dt|. 

Let $k := |v| - |u| + i$. If $k > 0$ this will be used to calculate a split point
for the compressed word which is to be substituted. Similarly, we will use
$l := |v| - |u| + j$; note that $l > 0$ implies $|v| \ge 2$. We get the following
subcases:

9.1. $k \le 0$, $l \le 0$, and $w_1 = \square$ (one rule),

9.2. $k \le 0$, $l > 0$, $w_1 = \square$, and $w_3''' = \square$ (three rules),

9.3. $k \le 0$, $l > 0$, $w_1 = \square$, and $w_3''' \neq \square$ (three rules),

9.4. $k \le 0$, $l \le 0$, and $w_1 \neq \square$ (three rules),

9.5. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \square$, and $w_3''' = \square$ (four rules),

9.6. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#$, and $w_3''' = \square$ (four rules),

9.7. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#\#$, and $w_3''' = \square$ (four rules),

9.8. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \square$, and $w_3''' = \#$ (four rules),

9.9. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#$, and $w_3''' = \#$ (four rules),

9.10. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#\#$, and $w_3''' = \#$ (four rules),

9.11. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \square$, and $w_3''' = \#\#$ (four rules),

9.12. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#$, and $w_3''' = \#\#$ (four rules),

9.13. $k \le 0$, $l > 0$, $w_1 \neq \square$, $w_2'' = \#\#$, and $w_3''' = \#\#$ (four rules),

9.14. $k > 0$, $l > 0$, $w_2'' = \square$, $w_3''' = \square$ (four rules),

9.15. $k > 0$, $l > 0$, $w_2'' = \square$, $w_3''' = \#$ (four rules),

9.16. $k > 0$, $l > 0$, $w_2'' = \square$, $w_3''' = \#\#$ (four rules),

9.17. $k > 0$, $l > 0$, $w_2'' = \#$, $w_3''' = \square$ (four rules),

9.18. $k > 0$, $l > 0$, $w_2'' = \#$, $w_3''' = \#$ (four rules),

9.19. $k > 0$, $l > 0$, $w_2'' = \#$, $w_3''' = \#\#$ (four rules),

9.20. $k > 0$, $l > 0$, $w_2'' = \#\#$, $w_3''' = \square$, $w_3 \notin \Gamma$; note that under these
premises $l - k > 1$ (four rules),

9.21. $k > 0$, $l > 0$, $w_2'' = \#\#$, $w_3''' = \square$, $w_3 \in \Gamma$; note that in this case $k + 1 = l$
(three rules),

9.22. $k > 0$, $l > 0$, $w_2'' = \#\#$, $w_3''' = \#$ (four rules), and

9.23. $k > 0$, $l > 0$, $w_2'' = \#\#$, $w_3''' = \#\#$ (four rules).

10. If n > 6 we only compress the information. In consequence, any reduction 
will take place by the rules of the cases for n < 6. 

Consider all t — (^t^ 1 ^Wi'W 2 ^W 3 ^W 4 ‘ ‘ ‘ — — ^ ^2 with 71 ^ 7. 

Then ^w„- 3 Wn -2 £ -^2- So, add the rule: 

{,^Wn-3^Wn-2^Wn — lWn^^'‘‘^Wn-3Wn-2^Wn-\Wn^')- 

3.6 An Example Case 

In order to show how the subcases of the above main cases can be handled, one
example for the case 9.16. is provided. For details refer to [Woi00a].

Example 5. Consider $t = (u, v, \zeta_{w_1 w_2}\zeta_{w_3}\zeta_{w_4 w_5}x) \in T_2$, so $n = 5$. Assume $|w_3| > 2$
and $|w_4| > 2$. We know $w_2 w_3 w_4 = u$. Furthermore, decompose $u$ and $v$ into
letters: $u = a_1 \cdots a_{|u|}$, $v = b_1 \cdots b_{|v|}$.

Then we know there exist $i, j > 0$ such that $j > i$ and $w_2 = a_1 \cdots a_i$,
$w_3 = a_{i+1} \cdots a_j$, and $w_4 = a_{j+1} \cdots a_{|u|}$.

We use $i$ and $j$ to determine how letters have to be spread over the subscripts
of the new compression letters. Let $k := |v| - |u| + i$ and $l := |v| - |u| + j$.
Obviously, $l > k$ always holds. Figure 1 illustrates the relation between
$i$, $j$, $k$, and $l$. As one can see, $l$ and $k$ determine over how
many compression letters the reduction result can be spread with respect to
$w_2$, $w_3$, and $w_4$.



[Figure: the split of $u$ into $w_2 = a_1 \cdots a_i$, $w_3 = a_{i+1} \cdots a_j$, and $w_4 = a_{j+1} \cdots a_{|u|}$, shown against the distribution of $b_1 \cdots b_{|v|}$ over the compression letters, depending on $k$ and $l$.]

Fig. 1. Identifying subcases for rule simulation with $i$, $j$, $k$, and $l$.



In addition we have to consider the #s at the split points between $w_2$ and
$w_3$, respectively between $w_3$ and $w_4$. It is clear that there exist $w_2'$, $w_2''$, $w_3'$,
$w_3''$, $w_3'''$, $w_4'$, and $w_4''$ such that $w_2 = w_2' w_2''$, $w_3 = w_3' w_3'' w_3'''$, $w_4 = w_4' w_4''$, with
$w_2'' w_3' = \#\#$ and $w_3''' w_4' = \#\#$.
Now the subcase 9.16. is identified by $l$ and $k$ as illustrated above, the length
of $w_2''$, and the length of $w_3'''$. The example above leads to $k = 1$, $l = 2$, $w_2'' = \square$,
and $w_3''' = \#\#$. In general in this case we need a new nonterminal which is
added to $\Gamma_2$ and the following four rules:

WiW2^W3^W4W5^-} ^WiW2 

1’t,2 '■= 

'■= (Cu,jf,j(jjtjb2)...(U|)bj^)jjDC&|c + l(titt&lc + 2)---(tt#6i + l)^*^’ 

Figure 2 illustrates the distribution of letters in the simulation;
the length of the boxes indicates their weight. Of the surplus letters # only
the most important are explicitly shown.



[Figure: boxes for the index words $w_2$, $w_3$, $w_4$ on the left-hand side of the first rule and for the right-hand sides of the four simulating rules, showing how the letters $b_1, b_2, \ldots$ are distributed step by step.]

Fig. 2. Distribution of letters in the simulation.



We have seen how to handle case 9.16. In the other cases, the rules are derived 
in a similar way. 

In order to assure confluence, the following changes to $R_4$ will be used. First
of all, remember that the original rewriting system is confluent. That means,
whenever two or more rules may be applied to one word, it does not matter
which of them is chosen. In particular, the choice does not change the result
with respect to acceptance or rejection of words. Now assume we had $R_4$ constructed
as above. Then there may be rules that have overlapping left-hand sides. Note
that these overlaps are always such that one left-hand side is a suffix of the
other. This is due to the unique place of possible reductions obtained by the
overlined index letters. Depending on the cases the possibly conflicting rules are
derived from, we cannot control the resulting distribution of index letters over
the compression letters. Therefore, $R_4$ might not be confluent. Instead, we drop
some of the rules in three steps:
1. Omit all compressing rules (case 10. above), for which a left-hand side of a
simulating rule is a suffix of or identical to the left-hand side of the com- 
pressing rule. Note that the compressing rule in that case is not necessary 
because the left-hand side of any simulating rule cannot be longer than that 
of a compressing rule. 

2. Whenever two rules have the same left-hand side, choose one of them (arbitrarily).
Because of the first step those rules always are either a single rule
simulating a rule of $R$ or the first of a sequence doing such a simulation.
Therefore, if the rules dropped contain letters from $\Gamma_2$, we may also drop
their successor rules.

3. Whenever one left-hand side of a rule is a suffix of the left-hand side of another 
rule, drop the rule with the longer left-hand side. Again, if we drop the start 
of a simulation sequence, the successors can be dropped, too. 
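
The three dropping steps above can be sketched in code. The following is only an illustration under simplifying assumptions, not the construction itself: rules are modelled as hypothetical `Rule` records whose `kind` field marks compressing rules (case 10.) versus simulating rules, left-hand sides are tuples of letters, and the removal of successor rules of a dropped simulation sequence is not modelled.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    lhs: tuple   # left-hand side as a tuple of letters
    rhs: tuple   # right-hand side as a tuple of letters
    kind: str    # "compress" (case 10.) or "simulate" (hypothetical tag)

def is_suffix(short, long):
    """True if the word `short` is a suffix of (or equal to) the word `long`."""
    return len(short) <= len(long) and long[len(long) - len(short):] == short

def drop_overlaps(rules):
    sims = [r for r in rules if r.kind == "simulate"]
    # Step 1: omit compressing rules whose left-hand side ends with
    # (or equals) the left-hand side of some simulating rule.
    kept = [r for r in rules
            if r.kind == "simulate"
            or not any(is_suffix(s.lhs, r.lhs) for s in sims)]
    # Step 2: among rules with the same left-hand side keep one
    # (here simply the first one encountered).
    by_lhs = {}
    for r in kept:
        by_lhs.setdefault(r.lhs, r)
    kept = list(by_lhs.values())
    # Step 3: if one left-hand side is a proper suffix of another,
    # drop the rule with the longer left-hand side.
    return [r for r in kept
            if not any(o.lhs != r.lhs and is_suffix(o.lhs, r.lhs) for o in kept)]

example = [
    Rule(("x", "a", "b"), ("c",), "compress"),
    Rule(("a", "b"), ("d",), "simulate"),
    Rule(("a", "b"), ("e",), "simulate"),
    Rule(("z", "q"), ("q",), "simulate"),
    Rule(("y", "z", "q"), ("r",), "simulate"),
]
remaining = drop_overlaps(example)
```

After the three steps no left-hand side in `remaining` is a suffix of another one, which is the property used for the absence of overlaps.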

It is important that dropping these rules does not change the accepted lan- 
guage, and that we may only do that because R is confluent. 

After that process of omitting rules, $R_4$ has no overlaps of left-hand rule sides
any more. So, $R_4$ itself is confluent (there can be no critical pairs, compare to



3.7 Weight Reduction 

In order to show that $R_4$ is weight reducing, we need a suitable weight function.
The idea is to distribute the weights of the right-hand rule sides $v$ in the original
system $R$ over two or more compression letters. Therefore the strategy for the
construction is called “weight spreading”. We only give the part of a weight
function $\psi$ that is defined for $\Gamma_1$. The weights for $\Sigma$, $\{¢, \$\}$, and for $\Gamma_2$ can be
easily found based on this. Let $\varphi(x) : W_f \to \mathbb{N}$ be the weight function defined



Then $\psi(x) : \Gamma_1 \to \mathbb{N}$ is defined by $\psi(\zeta_w) := \varphi(w) + 1$. The following property
can be verified easily:



So, $\psi$ is the $\Gamma_1$ part of the required weight function. For $\Gamma_2$ the fact can be
used that all weights are odd, so $\Gamma_2$ letters will have even weights just fitting
“in between”.

3.8 Final Rules 

The last step is to define rules that accept the result of a reduction: 



lEHHii). 



by: 




Claim. For all £,v,^w £ A: l^’l > |w| 4’iiv) > ’tpi^w) and G A 



i?5 := {(c[w$,?/)|'u; G ,w = yU} 
3.9 Correctness of the Construction

Lemma 1. Let $\Gamma' := \Sigma \cup \{¢, \$, Y\} \cup \Gamma_1 \cup \Gamma_2$, $\Sigma' := \Sigma$, $R' := R_1 \cup R_2 \cup R_3 \cup R_4 \cup R_5$.
Then $C' := (\Gamma', \Sigma', R', ¢, \$, Y)$ is a csCRLS and $L_{C'} = L_C$.

Proof. 

1. To check if C is well defined we only have to make sure that nowhere a 
would be necessary in R 4 . By observing the relations between the subscript 
word lengths and n, i,j, k, and I of the subcases described above this can be 
shown to be true. 

2. Checking the weight reduction of the rules is a rather tedious effort, but
straightforward.

3. First, observe that the first three parts $R_1 \cup R_2 \cup R_3$ can be constructed to
be confluent. Second, reduction rules of $R_4$ cannot overlap with those rules
or with rules from $R_5$ (the latter because $Y \in \mathrm{IRR}(R)$). Overlaps between
rules within R 4 cannot happen. (Note that as soon as a is introduced there 
is always exactly one next rule that can be applied until the ft is removed 
again. Especially, no such rule with ft can be applied twice.) So, finally R' 
is confluent. 

4. There exists a cover (comparable to the covers for context-free grammars
discussed by Nijholt¹) between accepting leftmost reductions that start with
words in $\Sigma^*$ in $R$ and accepting reductions in $R'$ on the same words. In
consequence, both CRLSs accept the same language.

5. Checking the context-splittability is the easiest part, it can be verified by 
simply looking at all rules. 

With this lemma, the proof of the normal form theorem is complete. □ 

4 Conclusion and Further Questions 

This paper shows that a normal form similar to context-sensitive grammars can
be established for systems defining CRLs. The initial motivation of the author
was to answer a question raised in the context of systems for prefix languages
of CRLs [Woi00b]. There, a construction for systems defining prefix languages
of CRLs was given which depends on the existence of the context-splittable
normal form (it was called prefix-splittable at that time).

The normal form theorem is a very strong hint that the CRLS given by the
construction in [Woi00b] cannot always be proved to be correct or false. This
conjecture is due to the fact that the CRLs are a basis for the recursively enumerable
languages. That means, given an alphabet $\Sigma$ (¢, # $\notin \Sigma$) and any r.e. language
$L \subseteq \Sigma^*$ there is a CRL $L' \subseteq \Sigma^* \cdot \{¢\} \cdot \{\#\}^*$ such that deleting the letters ¢ and #
with a homomorphism $h$ which leaves letters of $\Sigma$ unchanged leads to $h(L') = L$
(see also [OKK97]).

¹ A. Nijholt. Context-Free Grammars: Covers, Normal Forms, and Parsing. Springer-Verlag, 1980.
Although the normal form theorem shows that in principle and constructively
it is possible to find a prefix splittable system for every CRL, there can be 
conflicts with the prefix construction. This is a line of further research. 

One can see that the piecewise simulation of rules leads to a rewriting system 
that has reducible right-hand sides. A rewriting system that has no such rules 
is called interreduced. The context-splittable normal form and the property of 
being interreduced seem to be dual to each other: We conjecture that there is 
a CRL that does not have an interreduced csCRLS. Besides, the construction 
of a csCRLS makes heavy use of weight reduction. This makes the existence of 
a length reducing context-splittable normal form doubtful: We conjecture that 
there is a CRL that has no length reducing csCRLS. 

As mentioned above, the language classes CRL and the deterministic growing
context-sensitive languages (DGCSL) are identical. The latter can be described
by shrinking deterministic two-pushdown automata (sDTPDA) whose definition
is very similar to the automata mentioned above. The main differences are that 
they have bottom symbols and a state, that they can look at and replace both 
(single) top symbols of the stacks by arbitrarily long words (as long as the weight 
of the configuration shrinks), and the slightly different mode of accepting words. 
Niemann and Otto showed in mm that this model is equivalent in its power 
of accepting languages to the definition of CRLs.

By dropping the condition of confluence we obtain the class of the growing
context-sensitive languages (GCSL, defined in [DW86]) from the class CRL. We
conjecture that a normal form corresponding to the one established here for CRL
also holds for GCSL (thus justifying the use of the term context-sensitive). This
would imply that the class of the acyclic context-sensitive languages (ACSL)
coincides with GCSL (see [Bun96]).

Furthermore, by the construction given in this paper, it should be possible 
to require that the automata for CRL and GCSL are not allowed to replace the 
top symbols with arbitrarily long words. Instead, except for shifting operations, 
at most one letter per stack should suffice. 
Abstract. We propose simply typed term rewriting systems (STTRSs),
which extend first-order rewriting by allowing higher-order functions. We 
study a simple proof method for confluence which employs a characteri- 
zation of the diamond property of a parallel reduction. By an application 
of the proof method, we obtain a new confluence result for orthogonal 
conditional STTRSs. We also discuss a semantic method for proving ter- 
mination of STTRSs based on monotone interpretation. 



1 Introduction 

Higher-order functions are one of the useful features of functional programming.
The well-known higher-order function map takes a function as an argument and
applies it to all elements of a list:

map f [] = []

map f (x : xs) = f x : map f xs

It is not possible to directly express this definition by using a first-order term
rewriting system, because the variable f is also used as a function. In order to
deal with higher-order functions, one can use higher-order rewriting (e.g. Com- 
binatory Reduction Systems IKCTTI . Higher-Order Rewrite Systems EIH). 
Higher-order rewriting is a computation model which deals with higher-order 
terms. Higher-order functions and bound variables are usually used for con- 
structing the set of higher-order terms. The use of bound variables enriches the 
descriptive power of higher-order rewrite systems. However, it makes the the- 
ory complicated. The aim of this paper is to give a simple definition of rewrite 
systems which conservatively extend the ordinary (first-order) term rewriting 
systems in such a way that they can naturally express equational specifica- 
tions containing higher-order functions. We propose higher-order rewrite sys- 
tems which are close to the format of functional programming languages with 
pattern matching. Based on the new definition of higher-order rewrite systems 
(given in the next section) we investigate the confluence property of the rewrite 
systems. We first study a method for proving confluence using parallel reduction 
(in Section 3). Based on the proof method introduced, we prove confluence of 

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 338-|HSll 2001. 

© Springer-Verlag Berlin Heidelberg 2001
orthogonal higher-order rewrite systems (in Section 4). This confluence result
is further extended to the case of conditional higher-order rewriting (in Section 
5). We also discuss a semantic method for proving termination of the newly 
proposed rewrite systems (in Section 6). 

2 Simply Typed Term Rewriting Systems 

We assume the reader is familiar with abstract rewrite systems (ARSs) and 
(first-order) term rewrite systems (TRSs). In this section we propose a sim- 
ple extension of first-order rewrite systems (cf. iMnH]) to the 

higher-order case. The basic idea behind the following construction of terms is 
the same as typed combinatory logic (see e.g. ESHU) : variables may take argu- 
ments, and types are introduced in order to apply a function to its arguments 
correctly. 

Definition 1 (simply typed term, position, substitution) 

The set of simple types is the smallest set $\mathrm{ST}$ which contains the base type $o$ and
satisfies the property that $\tau_1 \times \cdots \times \tau_n \to \tau_0 \in \mathrm{ST}$ whenever $\tau_0, \tau_1, \ldots, \tau_n \in \mathrm{ST}$
with $n \ge 1$. We call a non-base type a function type. Let $V^{\tau}$ be a set of variable
symbols of type $\tau$ and $C^{\tau}$ be a set of constant symbols of type $\tau$, for every
type $\tau$. The set $T(V,C)^{\tau}$ of (simply typed) terms of type $\tau$ is the smallest set
satisfying the following two properties: (1) if $t \in V^{\tau} \cup C^{\tau}$ then $t \in T(V,C)^{\tau}$,
and (2) if $n \ge 1$, $t_0 \in T(V,C)^{\tau_1 \times \cdots \times \tau_n \to \tau}$, and $t_i \in T(V,C)^{\tau_i}$ for $i = 1, \ldots, n$,
then $(t_0\, t_1 \cdots t_n) \in T(V,C)^{\tau}$. Note that $\tau$ is not necessarily the base type. We
also define $V = \bigcup_{\tau \in \mathrm{ST}} V^{\tau}$, $C = \bigcup_{\tau \in \mathrm{ST}} C^{\tau}$, and $T(V,C) = \bigcup_{\tau \in \mathrm{ST}} T(V,C)^{\tau}$. In
order to be consistent with the standard definition of first-order terms, we use
$t_0(t_1, \ldots, t_n)$ as an alternative notation for $(t_0\, t_1 \cdots t_n)$. The outermost parentheses
of a term can be omitted. To enhance readability, infix notation is allowed.
We use the notation $t^{\tau}$ to make the type $\tau$ of a term $t$ explicit.

We do not confuse non-variable symbols with function symbols as in the first-order
case, because a variable symbol of non-base type expresses a function. The
term $t_0$ in a term of the form $(t_0\, t_1 \cdots t_n)$ expresses a function which is applied
to the arguments $t_1, \ldots, t_n$. The construction of the simply typed terms allows
arbitrary terms, including variables, at the position of $t_0$, while only constant
symbols are allowed in the first-order case. Thus a set of first-order terms is
obtained as a subset of simply typed terms which satisfies both $V^{\tau} = \emptyset$ whenever
$\tau \neq o$, and $C^{\tau_1 \times \cdots \times \tau_n \to \tau} = \emptyset$ whenever $\tau_i \neq o$ for some $i$ or $\tau \neq o$.

Let $t$ be a term in $T(V, C)$. The head symbol $\mathrm{head}(t)$ of $t$ is defined as follows:
(1) $\mathrm{head}(t) = t$ if $t \in V \cup C$, and (2) $\mathrm{head}(t) = \mathrm{head}(t_0)$ if $t = (t_0\, t_1 \cdots t_n)$. The set of
variable symbols occurring in $t$ is denoted by $\mathrm{Var}(t)$.

A position is a sequence of natural numbers. The empty sequence $\varepsilon$ is called
the root position. Positions are partially ordered by $\le$ as follows: $p \le q$ if there
exists a position $r$ such that $pr = q$. The set of positions in a term $t$ is denoted
by $\mathrm{Pos}(t)$. The subterm $t|_p$ of $t$ at position $p$ is defined as follows: (1) $t|_p = t$
if $p = \varepsilon$, and (2) $t|_p = t_i|_q$ if $t = (t_0\, t_1 \cdots t_n)$ and $p = iq$. The term obtained
from a term $s$ by replacing its subterm at position $p$ with a term $t$ is denoted
by $s[t]_p$. The set $\mathrm{Pos}(t)$ is divided into three parts. We say $p \in \mathrm{Pos}(t)$ is at a
variable position of $t$ if $t|_p \in V$, at a constant position if $t|_p \in C$, otherwise $p$
is at an application position. We denote the set of variable positions, constant
positions, and application positions in a term $t$ by $\mathrm{Pos}_v(t)$, $\mathrm{Pos}_c(t)$, and $\mathrm{Pos}_a(t)$,
respectively.
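
The notions of position, subterm, and replacement can be illustrated with a small sketch. The encoding is an assumption made only for this illustration: a symbol is a Python string, an application $(t_0\, t_1 \cdots t_n)$ is the tuple `(t0, t1, ..., tn)`, positions are tuples of indices with `()` as the root, and the set of variable symbols is passed explicitly. Types are not represented.

```python
def positions(t):
    """Yield all positions of t, root first (index 0 addresses the head t0)."""
    yield ()
    if isinstance(t, tuple):
        for i, ti in enumerate(t):
            for q in positions(ti):
                yield (i,) + q

def subterm(t, p):
    """Return the subterm t|_p."""
    for i in p:
        t = t[i]
    return t

def replace(s, p, t):
    """Return s[t]_p, the term s with the subterm at position p replaced by t."""
    if not p:
        return t
    i = p[0]
    return s[:i] + (replace(s[i], p[1:], t),) + s[i + 1:]

def classify(t, p, variables):
    """Say whether p is a variable, constant, or application position of t."""
    u = subterm(t, p)
    if isinstance(u, tuple):
        return "application"
    return "variable" if u in variables else "constant"

# The term  map F (x : xs), with the infix symbol ":" written in prefix form.
term = ("map", "F", (":", "x", "xs"))
variables = {"F", "x", "xs"}
```

For this term, `subterm(term, (2, 1))` is the variable `x`, and `replace(term, (2,), "[]")` yields the encoding of map F [].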

A substitution $\sigma$ is a function from $V$ to $T(V, C)$ such that its domain, defined
as the set $\{ x \in V \mid \sigma(x) \neq x \}$, is finite and $\sigma(x) \in T(V, C)^{\tau}$ whenever $x \in V^{\tau}$.
A substitution $\sigma : V \to T(V, C)$ is extended to the function $\hat{\sigma} : T(V, C) \to T(V, C)$
as follows: (1) $\hat{\sigma}(t) = \sigma(t)$ if $t \in V$, (2) $\hat{\sigma}(t) = t$ if $t \in C$, and (3)
$\hat{\sigma}(t) = (\hat{\sigma}(t_0)\, \hat{\sigma}(t_1) \cdots \hat{\sigma}(t_n))$ if $t = (t_0\, t_1 \cdots t_n)$. We will write $t\sigma$ instead of $\hat{\sigma}(t)$.
A renaming is a bijective substitution from $V$ to $V$. Two terms $t$ and $s$ are
unifiable by a unifier $\sigma$ if $s\sigma = t\sigma$.

Definition 2 (rewrite rule and rewrite relation) 

Let $T(V, C)$ be a set of simply typed terms. A rewrite rule is a pair of terms,
written as $l \to r$, such that $\mathrm{Var}(r) \subseteq \mathrm{Var}(l)$, $\mathrm{head}(l) \in C$, and $l$ and $r$ are of the
same type. The terms $l$ and $r$ are called the left-hand side and the right-hand
side of the rewrite rule, respectively. Let $R$ be a set of rewrite rules. We call
$\mathcal{R} = (R, V, C)$ a simply typed term rewriting system (STTRS for short).

Let $\mathcal{R} = (R, V, C)$ be an STTRS. We say a term $s$ rewrites to $t$, and write
$s \to_{\mathcal{R}} t$, if there exists a rewrite rule $l \to r \in R$, a position $p \in \mathrm{Pos}(s)$ and a
substitution $\sigma$ such that $s|_p = l\sigma$ and $t = s[r\sigma]_p$. We call the subterm $s|_p$ a redex
of $s$. In order to make the position $p$ of a rewrite step explicit, we also use the
notation $s \to^{p}_{\mathcal{R}} t$. Especially, a rewrite step at root position is denoted by $\to^{\varepsilon}_{\mathcal{R}}$
and a rewrite step at non-root position is denoted by $\to^{>\varepsilon}_{\mathcal{R}}$. When the underlying
STTRS $\mathcal{R}$ is clear from the context, we may omit the subscript $\mathcal{R}$ in $\to_{\mathcal{R}}$.

The definition of STTRS is close to the algebraic functional systems defined
by Jouannaud and Okada. The main difference is that we dispense with
bound variables since our focus is on higher-order functions and not on
tification. It is easy to see that every variable symbol is in normal form because 
of the second restriction imposed on the rewrite rules. The following example 
shows that a simple functional programming language with pattern matching 
can be modeled by STTRS. 

Example 3 (functional programming by STTRS) 

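Let $C$ be the set which consists of constant symbols $0^{o}$, $\mathsf{S}^{o \to o}$, $[\,]^{o}$, $:^{o \times o \to o}$,
$\mathsf{map}^{(o \to o) \times o \to o}$, $\circ^{(o \to o) \times (o \to o) \to (o \to o)}$, and $\mathsf{twice}^{(o \to o) \to (o \to o)}$, where $:$ and $\circ$ are
infix symbols, and the set of variable symbols $V$ contains $x^{o}$, $xs^{o}$, $F^{o \to o}$, and
$G^{o \to o}$. We define the set of rewrite rules $R$ as follows:

map F [] $\to$ []

map F (x : xs) $\to$ F x : map F xs

(F ∘ G) x $\to$ F (G x)

twice F $\to$ F ∘ F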




Confluence and Termination of Simply Typed Term Rewriting Systems 341 



Examples of rewrite sequences of the STTRS $\mathcal{R} = (R, V, C)$ are:

map (twice S) (0 : [])
  $\to_{\mathcal{R}}$ map (S ∘ S) (0 : [])
  $\to_{\mathcal{R}}$ (S ∘ S) 0 : map (S ∘ S) []
  $\to_{\mathcal{R}}$ S (S 0) : map (S ∘ S) []
  $\to_{\mathcal{R}}$ S (S 0) : []

map (twice S) (0 : [])
  $\to_{\mathcal{R}}$ (twice S) 0 : map (twice S) []
  $\to_{\mathcal{R}}$ (S ∘ S) 0 : map (twice S) []
  $\to_{\mathcal{R}}$ S (S 0) : map (twice S) []
  $\to_{\mathcal{R}}$ S (S 0) : []

In each step exactly one redex is rewritten.
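
The rewrite sequences above can be reproduced with a small untyped sketch. Everything about the encoding is an assumption for illustration: symbols are strings, an application $(t_0\, t_1 \cdots t_n)$ is a tuple, the infix symbols `:` and `∘` are written in prefix form (with `"."` standing for `∘`), types are not checked, and the strategy (root first, then arguments from left to right) is just one possible choice.

```python
VARS = {"F", "G", "x", "xs"}

def match(pat, t, sub):
    """Extend the substitution sub so that pat instantiated by it equals t.

    Returns the extended substitution, or None if matching fails.
    """
    if isinstance(pat, str):
        if pat in VARS:
            if pat in sub:
                return sub if sub[pat] == t else None
            return {**sub, pat: t}
        return sub if pat == t else None
    if not isinstance(t, tuple) or len(pat) != len(t):
        return None
    for p_i, t_i in zip(pat, t):
        sub = match(p_i, t_i, sub)
        if sub is None:
            return None
    return sub

def substitute(sub, t):
    """Apply the substitution sub to the term t."""
    if isinstance(t, str):
        return sub.get(t, t)
    return tuple(substitute(sub, ti) for ti in t)

RULES = [
    (("map", "F", "[]"), "[]"),
    (("map", "F", (":", "x", "xs")), (":", ("F", "x"), ("map", "F", "xs"))),
    (((".", "F", "G"), "x"), ("F", ("G", "x"))),
    (("twice", "F"), (".", "F", "F")),
]

def step(t):
    """Perform one rewrite step, or return None if t is in normal form."""
    for lhs, rhs in RULES:
        sub = match(lhs, t, {})
        if sub is not None:
            return substitute(sub, rhs)
    if isinstance(t, tuple):
        for i, ti in enumerate(t):
            r = step(ti)
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
    return None

def normalize(t, limit=1000):
    """Rewrite t until no rule applies (gives up after `limit` steps)."""
    for _ in range(limit):
        n = step(t)
        if n is None:
            return t
        t = n
    raise RuntimeError("no normal form reached within the step limit")

# map (twice S) (0 : [])
start = ("map", ("twice", "S"), (":", "0", "[]"))
result = normalize(start)
```

With this strategy the sketch follows the second sequence and ends in the encoding of S (S 0) : [], the same normal form as in the first sequence.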

3 Confluence by Parallel Moves 

In this section, we develop a method for proving confluence using parallel reduc- 
tion. We assume the reader is familiar with the basic notions of abstract rewrite 
systems (ARSs). For more detailed descriptions on abstract rewriting, see, for 
example, Klop’s survey EM. 

Definition 4 (properties of an ARS) 

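Let $\mathcal{A} = (A, \to)$ be an ARS. We use the following abbreviations:

property                          definition                                                                  abbreviation
$\mathcal{A}$ has the diamond property      $\leftarrow \cdot \to \;\subseteq\; \to \cdot \leftarrow$                                      $\Diamond(\to)$
$\mathcal{A}$ is confluent                  ${}^{*}\!\leftarrow \cdot \to^{*} \;\subseteq\; \to^{*} \cdot {}^{*}\!\leftarrow$              $\mathrm{CR}(\to)$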

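Lemma 5 (confluence by simultaneous reduction)

Let $(A, \to_1)$ and $(A, \to_2)$ be ARSs.

(1) $\Diamond(\to_1) \Rightarrow \mathrm{CR}(\to_1)$.

(2) If $\to_1^{*} = \to_2^{*}$ then $\mathrm{CR}(\to_1) \iff \mathrm{CR}(\to_2)$.

Proof. Straightforward. □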
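
For finite ARSs both properties of Definition 4 can be checked by brute force, which gives a quick way to experiment with Lemma 5(1). The sketch below is only an illustration; the encoding of a relation as a dictionary from an object to the list of its one-step successors, and the function names, are assumptions.

```python
from itertools import product

def reachable(step, a):
    """All objects b with a ->* b (reflexive-transitive closure from a)."""
    seen, todo = {a}, [a]
    while todo:
        x = todo.pop()
        for y in step.get(x, ()):
            if y not in seen:
                seen.add(y)
                todo.append(y)
    return seen

def has_diamond(step, objects):
    """Check that every one-step peak b <- a -> c has a one-step valley."""
    for a in objects:
        for b, c in product(step.get(a, ()), repeat=2):
            if not (set(step.get(b, ())) & set(step.get(c, ()))):
                return False
    return True

def is_confluent(step, objects):
    """Check that every many-step peak b *<- a ->* c is joinable."""
    for a in objects:
        for b, c in product(reachable(step, a), repeat=2):
            if not (reachable(step, b) & reachable(step, c)):
                return False
    return True

# A reflexive relation with the diamond property (hence confluent).
diamond = {"a": ["a", "b", "c"], "b": ["b", "d"], "c": ["c", "d"], "d": ["d"]}
# Confluent, but without the diamond property: the converse of (1) fails.
triangle = {"a": ["b", "c"], "b": ["c"]}
# Not confluent: b and c are distinct normal forms of a.
fork = {"a": ["b", "c"]}
```

The `triangle` relation shows why the diamond property is usually tested on an auxiliary relation with the same reflexive-transitive closure, as in Lemma 5(2).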



Definition 6 (parallel reduction) 

Let $\mathcal{R} = (R, V, C)$ be an STTRS. The parallel reduction relation induced by $\mathcal{R}$
is the smallest relation $\to_{\parallel}$ such that

(1) if $t \in V \cup C$ then $t \to_{\parallel} t$,

(2) if $s \to^{\varepsilon} t$ then $s \to_{\parallel} t$, and

(3) if $n \ge 1$ and $s_i \to_{\parallel} t_i$ for $i = 0, \ldots, n$, then $(s_0\, s_1 \cdots s_n) \to_{\parallel} (t_0\, t_1 \cdots t_n)$.

We may omit the underlying STTRS $\mathcal{R}$ in $\to_{\parallel}$ if it is not important.

One can easily verify that every parallel reduction relation is reflexive. Note
also that $\to \;\subseteq\; \to_{\parallel} \;\subseteq\; \to^{*}$, hence $\to_{\parallel}^{*} = \to^{*}$. From Lemma 5 we know that the
diamond property of a parallel reduction relation is a sufficient condition for the
confluence of the underlying STTRS: $\Diamond(\to_{\parallel}) \Rightarrow \mathrm{CR}(\to_{\parallel}) \iff \mathrm{CR}(\to)$. The following
lemma gives a characterization of the diamond property of a parallel reduction,
which is inspired by Gramlich’s characterization of the strong confluence of a
parallel reduction [Gra96].

Lemma 7 (parallel moves) 

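We have $\leftarrow_{\parallel} \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel} \iff \leftarrow_{\parallel} \cdot \to_{\parallel} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel}$.

Proof. The implication from right to left is obvious by $\to^{\varepsilon} \subseteq \to_{\parallel}$. For the reverse
implication, suppose $\leftarrow_{\parallel} \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel}$. We show that if $t \leftarrow_{\parallel} s \to_{\parallel} u$ then there
exists a term $v$ such that $t \to_{\parallel} v \leftarrow_{\parallel} u$, for all terms $s$, $t$, and $u$. The proof
is by induction on the structure of $s$. We distinguish three cases according to
the parallel reduction $s \to_{\parallel} t$. If $s = t \in V \cup C$ then $t \to_{\parallel} v = u$ by taking
$v = u$. If $s \to^{\varepsilon} t$ or $s \to^{\varepsilon} u$ then we use the assumption. Otherwise, we have
$s = (s_0\, s_1 \cdots s_n)$, $t = (t_0\, t_1 \cdots t_n)$, and $u = (u_0\, u_1 \cdots u_n)$ for some $n \ge 1$ with
$t_i \leftarrow_{\parallel} s_i \to_{\parallel} u_i$ ($i = 0, \ldots, n$). By the induction hypothesis, we know the existence
of terms $v_i$ ($i = 0, \ldots, n$) such that $t_i \to_{\parallel} v_i \leftarrow_{\parallel} u_i$. Let $v = (v_0\, v_1 \cdots v_n)$. Then
we have $t \to_{\parallel} v \leftarrow_{\parallel} u$ by definition. □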

This lemma allows us to partially localize the test for the diamond property 
of a parallel reduction, though the complete localization is impossible as shown 
in the following example. 



Example 8 (complete localization of parallel moves) 

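In this example we show that the implication $\leftarrow \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel} \;\Rightarrow\; \leftarrow_{\parallel} \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel}$
does not hold in general. Let $V = \emptyset$ and $C$ be the set consisting of the
constant symbol f of type $o \times o \to o$ and constant symbols a, b, c, d, e of type $o$.
Consider the set $R$ of rewrite rules defined by

R = {  f a a $\to$ c     a $\to$ b
       f a b $\to$ d     c $\to$ d
       f b a $\to$ d     d $\to$ e
       f b b $\to$ e               }

It is easy to see that the inclusion $\leftarrow \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel}$ holds. We have f b b $\leftarrow_{\parallel}$
f a a $\to^{\varepsilon}$ c but f b b $\to_{\parallel} \cdot \leftarrow_{\parallel}$ c is not satisfied.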
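
The claims of Example 8 can be checked mechanically. The sketch below is an illustration under assumed encodings: constants are strings, the application f s t is the tuple `("f", s, t)`, and since all rules are ground, a root step is a dictionary lookup. Root steps are only possible from the seven left-hand sides, so it suffices to examine peaks starting from those terms. The premise is read here with an ordinary single step on the left of the peak, which is an assumption about the intended statement.

```python
from itertools import product

RULES = {
    ("f", "a", "a"): "c",
    ("f", "a", "b"): "d",
    ("f", "b", "a"): "d",
    ("f", "b", "b"): "e",
    "a": "b",
    "c": "d",
    "d": "e",
}

def root(t):
    """Set of terms reachable from t by one root step."""
    return {RULES[t]} if t in RULES else set()

def one_step(t):
    """Set of terms reachable from t by one ordinary rewrite step."""
    out = root(t)
    if isinstance(t, tuple):
        for i, ti in enumerate(t):
            out |= {t[:i] + (r,) + t[i + 1:] for r in one_step(ti)}
    return out

def parallel(t):
    """Set of terms reachable from t by one parallel step (Definition 6)."""
    out = root(t)
    if isinstance(t, tuple):
        out |= set(product(*(parallel(ti) for ti in t)))
    else:
        out.add(t)
    return out

def local_inclusion_holds():
    """Every peak t <- s ->(root) u is joinable by parallel steps."""
    return all(parallel(t) & parallel(u)
               for s in RULES for t in one_step(s) for u in root(s))

def parallel_moves_holds():
    """Every peak t <-(parallel) s ->(root) u is joinable by parallel steps."""
    return all(parallel(t) & parallel(u)
               for s in RULES for t in parallel(s) for u in root(s))
```

The localized inclusion holds while the parallel moves inclusion fails, the offending peak being the one between f b b and c named in the example.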



Definition 9 (parallel moves property) 

We say an STTRS satisfies the parallel moves property, and write $\mathrm{PM}(\to)$, if the
inclusion $\leftarrow_{\parallel} \cdot \to^{\varepsilon} \;\subseteq\; \to_{\parallel} \cdot \leftarrow_{\parallel}$ holds.

Lemma 7 states that the parallel moves property is equivalent to the diamond
property of a parallel reduction. The parallel moves property is a useful sufficient
condition for proving the confluence of orthogonal rewrite systems.

Lemma 10 (confluence by parallel moves) 

Every STTRS with the parallel moves property is confluent. 

Proof. We have $\mathrm{PM}(\to) \iff \Diamond(\to_{\parallel}) \Rightarrow \mathrm{CR}(\to)$ by Lemmata 7 and 5 with
$\to_{\parallel}^{*} = \to^{*}$; see Fig. 1. □

Fig. 1. Confluence by parallel moves



4 Confluence of STTRSs 

In this section, we give a simple proof of the confluence of orthogonal STTRSs 
based on the parallel moves property. We are interested in orthogonal STTRSs, 
which are of practical importance, although there are confluence results obtained 
by weakening orthogonality, see, for example, EueHOl. For the definition of or- 
thogonality, we need the notions of left-linearity and overlap. 

Definition 11 (left-linearity) 

An STTRS is left-linear if none of the left-hand sides of its rewrite rules contains multiple
occurrences of a variable symbol.

Definition 12 (overlap)

An STTRS is overlapping if there exist rewrite rules $l \to r$ and $l' \to r'$ without
common variable symbols (after renaming) and a non-variable position
$p \in \mathrm{Pos}_c(l) \cup \mathrm{Pos}_a(l)$ such that

• $l|_p$ and $l'$ are unifiable, and

• if $p = \varepsilon$ then $l \to r$ is not obtained from $l' \to r'$ by renaming variable
symbols.



Example 13 (overlapping STTRS) 

Let $C$ be the set consisting of constant symbols f, g, h of type $o \to o$ and a, b of
type $o$. Let $V$ contain variable symbols $F$ of type $o \to o$ and $x$ of type $o$. We
define

R = {  f (F x) $\to$ F b
       g a $\to$ h b      }

The STTRS $\mathcal{R} = (R, V, C)$ is overlapping because the left-hand sides of the
rewrite rules are unifiable at an application position. In this STTRS the term
f (g a) has two different normal forms: g b $\leftarrow_{\mathcal{R}}$ f (g a) $\to_{\mathcal{R}}$ f (h b) $\to_{\mathcal{R}}$ h b. Note
that the redex (g a) in the initial term is destroyed by the application of the first
rewrite rule.
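
The two normal forms of f (g a) can be computed by exhaustive search. As before, the encoding is assumed purely for illustration (strings for symbols, tuples for applications, no type checking); note that the variable $F$ occurs in head position of the subterm (F x), so matching binds it to the constant g.

```python
VARS = {"F", "x"}

RULES = [
    (("f", ("F", "x")), ("F", "b")),   # f (F x) -> F b
    (("g", "a"), ("h", "b")),          # g a -> h b
]

def match(pat, t, sub):
    """Extend sub so that pat instantiated by sub equals t, or return None."""
    if isinstance(pat, str):
        if pat in VARS:
            if pat in sub:
                return sub if sub[pat] == t else None
            return {**sub, pat: t}
        return sub if pat == t else None
    if not isinstance(t, tuple) or len(pat) != len(t):
        return None
    for p_i, t_i in zip(pat, t):
        sub = match(p_i, t_i, sub)
        if sub is None:
            return None
    return sub

def substitute(sub, t):
    """Apply the substitution sub to the term t."""
    if isinstance(t, str):
        return sub.get(t, t)
    return tuple(substitute(sub, ti) for ti in t)

def successors(t):
    """All terms reachable from t by exactly one rewrite step."""
    out = set()
    for lhs, rhs in RULES:
        sub = match(lhs, t, {})
        if sub is not None:
            out.add(substitute(sub, rhs))
    if isinstance(t, tuple):
        for i, ti in enumerate(t):
            out |= {t[:i] + (r,) + t[i + 1:] for r in successors(ti)}
    return out

def normal_forms(t):
    """All normal forms reachable from t (assumes finitely many reducts)."""
    seen, todo, nfs = {t}, [t], set()
    while todo:
        s = todo.pop()
        succ = successors(s)
        if not succ:
            nfs.add(s)
        for r in succ - seen:
            seen.add(r)
            todo.append(r)
    return nfs

start = ("f", ("g", "a"))
```

Here `normal_forms(start)` contains the encodings of both g b and h b, so the system is not confluent.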



Definition 14 (orthogonality) 

An STTRS is orthogonal if it is left-linear and non-overlapping. 

Now we are ready to give a proof that orthogonal STTRSs are confluent. We 
first extend the use of parallel reduction to substitutions. 
Definition 15 (parallel reduction of a substitution)

Let $\sigma$ and $\tau$ be substitutions and $X$ be a set of variable symbols. We write
$\sigma \to_{\parallel [X]} \tau$ if $\sigma(x) \to_{\parallel} \tau(x)$ for all $x \in X$.



Lemma 16 (parallel reduction of a substitution)

Let $\sigma$ and $\tau$ be substitutions and $t$ be a term. If $\sigma \to_{\parallel [X]} \tau$ and $\mathrm{Var}(t) \subseteq X$ then
$t\sigma \to_{\parallel} t\tau$.

Proof. An easy induction on the structure of $t$. □



Lemma 17 (key properties for confluence) 

Let $\mathcal{R} = (R, V, C)$ be an orthogonal STTRS.

(1) $\leftarrow^{\varepsilon} \cdot \to^{\varepsilon} \;\subseteq\; =$.

(2) For all rewrite rules $l \to r \in R$, substitutions $\sigma$, and terms $t$, if $t \leftarrow_{\parallel} l\sigma \to^{\varepsilon} r\sigma$
and not $l\sigma \to^{\varepsilon} t$ then there exists a substitution $\tau$ such that $t = l\tau$ with
$\sigma \to_{\parallel [\mathrm{Var}(l)]} \tau$.

Proof.

(1) Since $\mathcal{R}$ is non-overlapping, we can only use (two renamed versions of) the
same rewrite rule in $R$ for rewriting a term at the same position. Hence we
always obtain the same term.

(2) Since $\mathcal{R}$ is non-overlapping, there is no term $s'$ with $l|_p\sigma \to^{\varepsilon} s'$, for all
non-variable positions $p$ in $l$. Hence we have $l|_p\sigma \to_{\parallel} t|_p$ for all variable positions
$p$ in $l$. Define the substitution $\tau$ by $\tau(x) = t|_p$ if $l|_p = x$ and $\tau(x) = x$
otherwise. This substitution is well-defined because there are no multiple
occurrences of $x$ in $l$ by left-linearity. It is easy to see that $\sigma \to_{\parallel [\mathrm{Var}(l)]} \tau$ and
$t = l\tau$ by construction.

□



Lemma 18 (Parallel Moves Lemma) 

Every orthogonal TRS $\mathcal{R} = (R, V, C)$ has the parallel moves property,
i.e., $\leftleftarrows \cdot \to \;\subseteq\; \rightrightarrows \cdot \leftleftarrows$.

Proof. Suppose $t \leftleftarrows s \to u$. We show
$t \rightrightarrows \cdot \leftleftarrows u$. If $s \to t$ then the desired
result follows from Lemma 17(1). Otherwise, we do not have $s \to t$. Since
there exists a rewrite rule $l \to r \in R$ and a substitution $\sigma$ such
that $s = l\sigma$ and $u = r\sigma$, we know the existence of a substitution
$\tau$ such that $t = l\tau$ with
$\sigma \rightrightarrows_{[\mathrm{Var}(l)]} \tau$, by Lemma 17(2). Therefore,
$t = l\tau \to r\tau \leftleftarrows r\sigma = u$ by Lemma 16 and
$\mathrm{Var}(r) \subseteq \mathrm{Var}(l)$. Note that the case $s = t \in V$
is impossible because every variable symbol is in normal form and that the
case $s = t \in C$ is contained in the case $s \to u$. □

Note that the Parallel Moves Lemma does not hold for orthogonal higher- 
order rewrite systems with bound variables as observed in the literature, see 
and proof of confluence in such rewrite systems can be found 

in [MN98]. We conclude this section with its main result and an example of
its application.

Theorem 19 (confluence by orthogonality)

Every orthogonal TRS is confluent. 

Proof. By Lemma [Tni and the Parallel Moves Lemma (Lemma □ 



Example 20 (confluence by orthogonality) 

The example STTRS given in Example 0 is confluent because it is orthogonal. 

5 Confluence of Conditional STTRSs 

In this section, we generalize the confluence result presented in the previous
section to the case of conditional rewriting. Bergstra and Klop proved the
confluence of first-order orthogonal CTRSs in [BK86]. Their proof depends on
the notion of development and the fact that every development is finite. Our
result in this section generalizes their result to STTRSs and also simplifies
their confluence proof by basing it on the parallel moves property.

Definition 21 (conditional rewrite rule) 

Let T(V, C) be a set of terms. A conditional rewrite rule
$l \to r \Leftarrow c$ consists of a rewrite rule $l \to r$ and the conditional
part c. Here c is a possibly empty finite sequence
$c = l_1 \approx r_1, \ldots, l_n \approx r_n$ of equations such that every
pair of terms $l_i$ and $r_i$ are of the same type and
$\mathrm{Var}(r) \subseteq \mathrm{Var}(c)$. If the conditional part is empty,
we may simply write $l \to r$. Let R be a set of conditional rewrite rules. We
call $\mathcal{R} = (R, V, C)$ a conditional STTRS.

A variable in the right-hand side or in the conditional part of a rewrite rule 
which does not appear in the corresponding left-hand side is called an extra 
variable. In this paper, we allow extra variables only in the conditional part but 
not in the right-hand sides of rewrite rules. 

Definition 22 (rewrite relation) 

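The rewrite relation $\to_{\mathcal{R}}$ of a conditional STTRS
$\mathcal{R} = (R, V, C)$ is defined as follows: $s \to_{\mathcal{R}} t$ if and
only if $s \to_{\mathcal{R}_k} t$ for some $k \geq 0$. The minimum such k is
called the level of the rewrite step. Here the relations
$\to_{\mathcal{R}_k}$ are inductively defined:

$$\to_{\mathcal{R}_0} \;=\; \emptyset,$$
$$\to_{\mathcal{R}_{k+1}} \;=\; \{\, (t[l\sigma]_p,\; t[r\sigma]_p) \mid l \to r \Leftarrow c \in R,\; c\sigma \subseteq \to^*_{\mathcal{R}_k} \,\}.$$

Here $c\sigma$ denotes the set
$\{\, l'\sigma \approx r'\sigma \mid l' \approx r' \text{ belongs to } c \,\}$.
Therefore $c\sigma \subseteq \to^*_{\mathcal{R}_k}$ with
$c = l_1 \approx r_1, \ldots, l_n \approx r_n$ is a shorthand for
$l_1\sigma \to^*_{\mathcal{R}_k} r_1\sigma, \ldots, l_n\sigma \to^*_{\mathcal{R}_k} r_n\sigma$.

We may abbreviate $\to_{\mathcal{R}_k}$ to $\to_k$ if there is no need to make
the underlying conditional STTRS explicit.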

Properties of conditional STTRSs are often proved by induction on the level 
of a rewrite step. So, it is useful for proving confluence of orthogonal conditional 
STTRSs to introduce the parallel reduction relations which are indexed by levels. 
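
Definition 23 (parallel reduction relations indexed by levels)

Let $\mathcal{R}$ be a conditional STTRS. We define
$\rightrightarrows_{\mathcal{R}_k}$ as the smallest relation such that

(1) $t \rightrightarrows_{\mathcal{R}_k} t$ for all terms t,

(2) if $s \to_{\mathcal{R}_k} t$ then $s \rightrightarrows_{\mathcal{R}_k} t$,

(3) if $k \geq k_i$ and $s_i \rightrightarrows_{\mathcal{R}_{k_i}} t_i$ for
$i = 0, \ldots, n$, then
$(s_0\, s_1 \cdots s_n) \rightrightarrows_{\mathcal{R}_k} (t_0\, t_1 \cdots t_n)$.

We may abbreviate $\rightrightarrows_{\mathcal{R}_k}$ to $\rightrightarrows_k$
when no confusion can arise.

Observe that $s \rightrightarrows_{\mathcal{R}} t$ if and only if
$s \rightrightarrows_{\mathcal{R}_k} t$ for some level $k \geq 0$. It is also
easy to verify that
$\to_{\mathcal{R}_k} \subseteq \rightrightarrows_{\mathcal{R}_k} \subseteq \to^*_{\mathcal{R}_k}$
for all levels $k \geq 0$.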

Definition 24 (properties of an ARS with indexes) 

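Let $\mathcal{A} = (A, \bigcup_{i \in I} \to_i)$ be an ARS whose rewrite
relations are indexed. We use the following abbreviations:

    definition                                                                    abbreviation
    $\leftarrow_j \cdot \to_k \;\subseteq\; \to_k \cdot \leftarrow_j$             $\Diamond^j_k(\to)$
    $\leftarrow^*_j \cdot \to^*_k \;\subseteq\; \to^*_k \cdot \leftarrow^*_j$     $\mathrm{CR}^j_k(\to)$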



Lemma 25 (confluence by simultaneous reduction) 

Let $(A, \to_i)$ and $(A, \Rightarrow_i)$ be ARSs such that
$\Rightarrow^*_i \;=\; \to^*_i$ for all $i \in I$.

We have $\Diamond^j_k(\Rightarrow) \implies \mathrm{CR}^j_k(\to)$.

Proof. Straightforward. □ 



Definition 26 (parallel moves property for conditional STTRSs) 

We say a conditional STTRS satisfies the parallel moves property with respect
to levels j and k, and write $\mathrm{PM}^j_k(\to)$, if the inclusion
$\leftleftarrows_j \cdot \to_k \;\subseteq\; \rightrightarrows_k \cdot \leftleftarrows_j$
holds.



Lemma 27 (parallel moves for conditional STTRSs) 

The following two statements are equivalent, for all m > 0.

(1) $\mathrm{PM}^j_k(\to)$ for all j, k with j + k < m.

(2) $\Diamond^j_k(\rightrightarrows)$ for all j, k with j + k < m.

Proof. The implication (2) ⟹ (1) is obvious because
$\to_k \subseteq \rightrightarrows_k$ by definition.

For the proof of the implication (1) ⟹ (2), suppose statement (1) holds. We
show that if $t \leftleftarrows_j s \rightrightarrows_k u$ and j + k < m then
there exists a term v such that
$t \rightrightarrows_k v \leftleftarrows_j u$, for all terms s, t, u and levels
j, k. The proof is by induction on the structure of s. We distinguish three
cases according to the parallel reduction $s \rightrightarrows_j t$. If s = t
then $t \rightrightarrows_k v = u$ by taking v = u. If $s \to_j t$ or
$s \to_k u$ then we can use the assumption (1) in both cases because
j + k < m. Otherwise, we have $s = (s_0\, s_1 \cdots s_n)$,
$t = (t_0\, t_1 \cdots t_n)$, and $u = (u_0\, u_1 \cdots u_n)$ for some
$n \geq 1$ with $t_i \leftleftarrows_{j_i} s_i \rightrightarrows_{k_i} u_i$
($i = 0, \ldots, n$). Since $j_i + k_i \leq j + k < m$, the induction
hypothesis yields the existence of terms $v_i$ such that
$t_i \rightrightarrows_{k_i} v_i \leftleftarrows_{j_i} u_i$, for
$i = 0, \ldots, n$. Let $v = (v_0\, v_1 \cdots v_n)$. Then we have
$t \rightrightarrows_k v \leftleftarrows_j u$ by definition. □

The following lemma gives a sufficient condition for the confluence of CTRSs.

Lemma 28 (confluence by parallel moves)

If a CTRS satisfies $\mathrm{PM}^j_k(\to)$ for all levels j and k then
$\mathrm{CR}^j_k(\to)$ holds for all j and k, hence it is confluent.

Proof. By Lemmata 27 and 25 with $\rightrightarrows^*_i \;=\; \to^*_i$ for all
levels i, see Fig. 2. □



Fig. 2. Confluence of CTRSs by parallel moves



Imposing restrictions on reducibility of the right-hand sides of the conditions 
is important for ensuring the confluence of conditional STTRSs. 

Definition 29 (normal conditional STTRS)

Let $\mathcal{R}$ be a conditional STTRS. A term t is called normal if it
contains no variables and is a normal form with respect to the unconditional
version of $\mathcal{R}$. Here the unconditional version is obtained from
$\mathcal{R}$ by dropping all conditions. A conditional STTRS is called normal
if every right-hand side of an equation in the conditional part of a rewrite
rule is normal.

We extend the indexed version $\rightrightarrows_k$ of parallel reduction to a
relation on substitutions as in the unconditional case (Definition 15). It is
easy to verify that the level version of Lemma 16 also holds. Now we are ready
to prove the Parallel Moves Lemma for conditional STTRSs. In the conditional
case, we must confirm that the conditions are still satisfied after the change
of the substitution.

Lemma 30 (Parallel Moves Lemma for conditional STTRSs) 

Every orthogonal normal conditional STTRS $\mathcal{R} = (R, V, C)$ satisfies
$\mathrm{PM}^j_k(\to)$, i.e.,
$\leftleftarrows_j \cdot \to_k \;\subseteq\; \rightrightarrows_k \cdot \leftleftarrows_j$,
for all levels j and k.

Proof. We show that if $t \leftleftarrows_j s \to_k u$ then there exists a term
v such that $t \rightrightarrows_k v \leftleftarrows_j u$. The proof is by
induction on j + k. The case j + k = 0 is trivial because $\to_0 = \emptyset$.
Suppose j + k > 0. We distinguish two cases. If $s \to_j t$ then we have
t = u because $\mathcal{R}$ is non-overlapping and has no extra variable in the
right-hand sides of R. Hence we can take v = t = u. Consider the case that
$s \to_j t$ does
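not hold. From $s \to_k u$ we know that there exists a conditional rewrite rule
$l \to r \Leftarrow c \in R$ and a substitution $\sigma$ such that
$s = l\sigma$, $u = r\sigma$, and $c\sigma \subseteq \to^*_{k-1}$. Since
$\mathcal{R}$ is non-overlapping, there is no term s' with
$l|_p\sigma \to_j s'$ for all non-variable positions p in l. Define the
substitution $\tau$ by $\tau(x) = t|_p$ if $l|_p = x$ and
$\tau(x) = \sigma(x)$ otherwise. This substitution is well-defined by the
left-linearity of $\mathcal{R}$. We have
$\sigma \rightrightarrows_{j[\mathrm{Var}(l, c)]} \tau$ and $t = l\tau$ by the
definition of $\tau$. From $\mathrm{Var}(r) \subseteq \mathrm{Var}(l)$ and the
level version of Lemma 16 we obtain $r\sigma \rightrightarrows_j r\tau$. It
remains to show that $t = l\tau \to_k r\tau$. So, we will prove
$c\tau \subseteq \to^*_{k-1}$. Let $l' \approx r'$ be an arbitrary condition in
c. We have to prove that $l'\tau \to^*_{k-1} r'\tau$. Since
$c\sigma \subseteq \to^*_{k-1}$ we have $l'\sigma \to^*_{k-1} r'\sigma$.
Moreover, $\sigma \rightrightarrows_{j[\mathrm{Var}(l, c)]} \tau$,
$\mathrm{Var}(c) \subseteq \mathrm{Var}(l, c)$, and the level version of
Lemma 16 yield that $l'\sigma \rightrightarrows_j l'\tau$. Hence
$l'\tau \leftleftarrows_j l'\sigma \to^*_{k-1} r'\sigma$. From the induction
hypothesis and Lemmata 27 and 25 we know the existence of a term v such that
$l'\tau \to^*_{k-1} v \leftarrow^*_j r'\sigma$. Because $\mathcal{R}$ is
normal, $r'\sigma = r' = r'\tau = v$. Hence $l'\tau \to^*_{k-1} r'\tau$.
Therefore $l\tau \to_k r\tau$. □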



Theorem 31 (confluence by orthogonality) 

Every orthogonal normal conditional STTRS is confluent. 

Proof. By Lemma 28 and the Parallel Moves Lemma for conditional STTRSs
(Lemma 30). □



6 Termination of STTRSs by Interpretation 

In this section, we discuss how to prove termination of STTRSs by a semantic
method. We adapt the semantic proof technique used both in the first-order case
and in the higher-order case [vdP94]. In a semantic method, terms are
interpreted as elements of a well-founded ordered set.

Definition 32 (monotone domain) 

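Given a non-empty set $D_o$, called a base domain, and a strict partial order
(irreflexive transitive relation) $\succ_o$ on $D_o$, we recursively define the
set $D_\tau$ and the relation $\succ_\tau$ on $D_\tau$, for every function type
$\tau = \tau_1 \times \cdots \times \tau_n \to \tau_0$, as follows:

$$D_\tau = \{\, \varphi : D_{\tau_1} \times \cdots \times D_{\tau_n} \to D_{\tau_0} \mid \varphi \text{ is monotone} \,\}$$

where a function $\varphi \in D_\tau$ is called monotone if

$$\varphi(x_1, \ldots, x_i, \ldots, x_n) \succ_{\tau_0} \varphi(x_1, \ldots, y, \ldots, x_n)$$

for all $i \in \{1, \ldots, n\}$ and
$x_1 \in D_{\tau_1}, \ldots, x_n \in D_{\tau_n}, y \in D_{\tau_i}$ with
$x_i \succ_{\tau_i} y$. We write $\varphi \succ_\tau \psi$ if
$\varphi, \psi \in D_\tau$ and

$$\varphi(x_1, \ldots, x_n) \succ_{\tau_0} \psi(x_1, \ldots, x_n)$$

for all $x_1 \in D_{\tau_1}, \ldots, x_n \in D_{\tau_n}$.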

Lemma 33 (properties of $\succ_\tau$)

Let $\succ_o$ be a strict partial order on a base domain $D_o$.

(1) $\succ_\tau$ also is a strict partial order for all types $\tau$.

(2) If $\succ_o$ is well-founded, then $\succ_\tau$ also is well-founded for
all types $\tau$.

Proof. By induction on $\tau$. □

We interpret every constant symbol by an algebra operation on a well- 
founded ordered set. A valuation gives an interpretation to variables. 

Definition 34 (monotone interpretation) 

Let $\succ_o$ be a strict partial order on a base domain $D_o$. A monotone
interpretation I associates every constant symbol c of type $\tau$ with its
interpretation $c_I \in D_\tau$. A valuation is a function which maps a
variable symbol of type $\tau$ to an element in $D_\tau$.

Let I be a monotone interpretation and $\alpha$ be a valuation. A term t of
type $\tau$ is interpreted as an element of $D_\tau$ as follows: If t is a
variable symbol, then $[\![t]\!]_\alpha = \alpha(t)$. If t is a constant
symbol, then $[\![t]\!]_\alpha = t_I$. If $t = (t_0\, t_1 \cdots t_n)$ then
$[\![t]\!]_\alpha = [\![t_0]\!]_\alpha([\![t_1]\!]_\alpha, \ldots, [\![t_n]\!]_\alpha)$.
We define the relation $\succ_I$ on the set of terms as follows: $s \succ_I t$
if and only if s and t are of the same type, say $\tau$, and
$[\![s]\!]_\alpha \succ_\tau [\![t]\!]_\alpha$ for all valuations $\alpha$.

The order $\succ_I$ compares two terms by interpretation. If a well-founded
order $\succ$ on terms is closed under contexts and substitutions, then
$l \succ r$ for all rewrite rules $l \to r \in R$ suffices to show the
termination of $\mathcal{R}$. It turns out that $\succ_I$ is closed under
contexts and substitutions.

Lemma 35 (properties of $\succ_I$)

Let $\succ_o$ be a strict partial order on a base domain $D_o$ and I be a
monotone interpretation.

(1) $\succ_I$ also is a strict partial order.

(2) If $\succ_o$ is well-founded, then $\succ_I$ also is well-founded.

Proof. Easy consequences of Lemma 33. □



Lemma 36 (closure under contexts of $\succ_I$)

Let $\succ_o$ be a strict partial order on a base domain $D_o$ and I be a
monotone interpretation. Then, $\succ_I$ is closed under contexts, i.e., if
$s \succ_I t$ then $u[s]_p \succ_I u[t]_p$ for all possible terms u and
positions p in u.

Proof. The proof is by induction on p. The case $p = \epsilon$ is trivial. If
$p = iq$, then u should have the form $(u_0\, u_1 \cdots u_n)$ and $u_0$ has
function type, say $\tau = \tau_1 \times \cdots \times \tau_n \to \tau_0$.
From the induction hypothesis, we know $u_i[s]_q \succ_I u_i[t]_q$. We
distinguish two cases. If i = 0, then
$[\![u_0[s]_q]\!]_\alpha \succ_\tau [\![u_0[t]_q]\!]_\alpha$ for all
valuations $\alpha$, by the induction hypothesis. Hence

$$[\![u[s]_p]\!]_\alpha = [\![(u_0[s]_q\, u_1 \cdots u_n)]\!]_\alpha = [\![u_0[s]_q]\!]_\alpha([\![u_1]\!]_\alpha, \ldots, [\![u_n]\!]_\alpha)$$
$$\succ_{\tau_0} [\![u_0[t]_q]\!]_\alpha([\![u_1]\!]_\alpha, \ldots, [\![u_n]\!]_\alpha) = [\![(u_0[t]_q\, u_1 \cdots u_n)]\!]_\alpha = [\![u[t]_p]\!]_\alpha$$

for all $\alpha$. Therefore $u[s]_p \succ_I u[t]_p$. Consider the case i > 0.
By the induction hypothesis, we have
$[\![u_i[s]_q]\!]_\alpha \succ_{\tau_i} [\![u_i[t]_q]\!]_\alpha$ for all
valuations $\alpha$. Hence,

$$[\![u[s]_p]\!]_\alpha = [\![(u_0\, u_1 \cdots u_i[s]_q \cdots u_n)]\!]_\alpha = [\![u_0]\!]_\alpha([\![u_1]\!]_\alpha, \ldots, [\![u_i[s]_q]\!]_\alpha, \ldots, [\![u_n]\!]_\alpha)$$
$$\succ_{\tau_0} [\![u_0]\!]_\alpha([\![u_1]\!]_\alpha, \ldots, [\![u_i[t]_q]\!]_\alpha, \ldots, [\![u_n]\!]_\alpha) = [\![(u_0\, u_1 \cdots u_i[t]_q \cdots u_n)]\!]_\alpha = [\![u[t]_p]\!]_\alpha$$

for all valuations $\alpha$, by monotonicity. Therefore
$u[s]_p \succ_I u[t]_p$. □



Lemma 37 (closure under substitutions of $\succ_I$)

Let $\succ_o$ be a strict partial order on a base domain $D_o$ and I be a
monotone interpretation.

(1) $[\![t\sigma]\!]_\alpha = [\![t]\!]_{\alpha * \sigma}$ for all terms t,
substitutions $\sigma$, and valuations $\alpha$. Here the valuation
$\alpha * \sigma$ is defined by
$(\alpha * \sigma)(x) = [\![\sigma(x)]\!]_\alpha$.

(2) $\succ_I$ is closed under substitutions, i.e., if $s \succ_I t$ then
$s\sigma \succ_I t\sigma$ for all substitutions $\sigma$.

Proof. (1) is by structural induction on t. (2) is an easy consequence of (1). □

The following theorem provides a semantic proof method for termination of 
STTRSs. 

Theorem 38 (termination of STTRSs by interpretation) 

Let $\mathcal{R} = (R, V, C)$ be an STTRS. If there exists a base domain
$D_o$, a well-founded strict partial order $\succ_o$ on $D_o$, and a monotone
interpretation I, such that $[\![l]\!]_\alpha \succ_\tau [\![r]\!]_\alpha$ for
all rewrite rules $l \to r \in R$ and valuations $\alpha$, then $\mathcal{R}$
is terminating.

Proof. It is easy to see that $\to_{\mathcal{R}} \subseteq \succ_I$ because we
have $[\![l]\!]_\alpha \succ_\tau [\![r]\!]_\alpha$ for all rewrite rules
$l \to r$ and valuations $\alpha$ by assumption, and the relation $\succ_I$ is
closed under contexts and substitutions by Lemmata 36 and 37. For the proof
by contradiction, suppose $\to_{\mathcal{R}}$ is not well-founded. So, there is
an infinite rewrite sequence
$t_0 \to_{\mathcal{R}} t_1 \to_{\mathcal{R}} \cdots$. From this sequence and
the inclusion $\to_{\mathcal{R}} \subseteq \succ_I$ we obtain an infinite
descending sequence $t_0 \succ_I t_1 \succ_I \cdots$. This contradicts the
well-foundedness of the relation $\succ_I$, which follows from Lemma 35. □



Example 39 (termination of STTRSs by interpretation) 

Consider again the STTRS TZ of Example 0 In order to prove the termination 
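of $\mathcal{R}$, define the base domain as
$D_o = \mathbb{N} \setminus \{0, 1\}$, and the order $\succ_o$ by using the
standard order > on $\mathbb{N}$ as follows: $m \succ_o n$ if and only if
m > n. Constant symbols are interpreted as follows:

$$[\,]_I = 2$$
$$:_I(m, n) = m + n$$
$$\mathrm{map}_I(\varphi, n) = n \times \varphi(n)$$
$$\circ_I(\varphi, \psi)(n) = \varphi(\psi(n)) + 1$$
$$\mathrm{twice}_I(\varphi)(n) = \varphi(\varphi(n)) + 2$$

Note that $[\,]_I \in D_o$, $:_I \in D_{o \times o \to o}$,
$\mathrm{map}_I \in D_{(o \to o) \times o \to o}$,
$\circ_I \in D_{(o \to o) \times (o \to o) \to (o \to o)}$, and
$\mathrm{twice}_I \in D_{(o \to o) \to (o \to o)}$ are satisfied. It is easy to
see that $[\![l]\!]_\alpha \succ_\tau [\![r]\!]_\alpha$ for all rewrite rules
$l \to r \in R$ and valuations $\alpha$. Therefore $\mathcal{R}$ is
terminating.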
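
The inequalities claimed in Example 39 can be spot-checked numerically. The sketch below is only a sanity check on sample values, not a proof. The rewrite rules of the referenced example are not repeated in this section, so the sketch assumes the usual rules for these symbols (map F [] → [], map F (x : xs) → F x : map F xs, (F ∘ G) x → F (G x), twice F → F ∘ F); the sample functions and the range of test values are arbitrary choices.

```python
# Interpretations from Example 39 over the base domain N \ {0, 1}.
nil = 2

def cons(m, n):
    return m + n

def map_(phi, n):
    return n * phi(n)

def comp(phi, psi):
    return lambda n: phi(psi(n)) + 1

def twice(phi):
    return lambda n: phi(phi(n)) + 2

# Sample strictly monotone functions on N \ {0, 1} (arbitrary test choices).
samples = [lambda n: n, lambda n: n + 3, lambda n: 2 * n, lambda n: n * n]
domain = range(2, 12)

def check():
    """Check lhs > rhs for the ASSUMED rules on all sample values."""
    for phi in samples:
        # map F [] -> []
        if not map_(phi, nil) > nil:
            return False
        for x in domain:
            # twice F -> F o F, compared pointwise as functions
            if not twice(phi)(x) > comp(phi, phi)(x):
                return False
            for psi in samples:
                # (F o G) x -> F (G x)
                if not comp(phi, psi)(x) > phi(psi(x)):
                    return False
            for xs in domain:
                # map F (x : xs) -> F x : map F xs
                if not map_(phi, cons(x, xs)) > cons(phi(x), map_(phi, xs)):
                    return False
    return True

print(check())  # prints True
```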



The converse of Theorem 38 holds for the first-order case [Zan94]. However,
for STTRSs, the converse does not hold as shown in the following example.

Example 40 (incompleteness of the semantic proof method)

Let $C = \{0^o, 1^o, g^{o \to o}\}$, the set of variable symbols V contain
$F^{o \times o \to o}$, and $R = \{\, g\,(F\,0\,1) \to g\,(F\,1\,0) \,\}$. It is
not difficult to see that the STTRS (R, V, C) is terminating.

Suppose there is a base domain $D_o$, a well-founded strict partial order
$\succ_o$ on $D_o$, and a monotone interpretation I, such that
$[\![l]\!]_\alpha \succ_\tau [\![r]\!]_\alpha$ for all rewrite rules
$l \to r \in R$ and valuations $\alpha$. We have
$g_I(f(0_I, 1_I)) \succ_o g_I(f(1_I, 0_I))$ for all monotone functions
$f \in D_{o \times o \to o}$. Let h be an arbitrary monotone function in
$D_{o \times o \to o}$. Then, the function h' defined by $h'(x, y) = h(y, x)$
is also monotone and hence $h' \in D_{o \times o \to o}$. Therefore,

$$g_I(h(0_I, 1_I)) \succ_o g_I(h(1_I, 0_I)) = g_I(h'(0_I, 1_I)) \succ_o g_I(h'(1_I, 0_I)) = g_I(h(0_I, 1_I)).$$

This contradicts the irreflexivity of $\succ_o$.

At present, it is not known whether there is a complete proof method for 
termination of STTRSs based on monotone interpretation. 



7 Concluding Remarks 

We have proposed simply typed term rewriting systems (STTRSs), which are close
to the format of functional programming languages with pattern matching. For
proving the confluence of orthogonal rewrite systems, we introduced the paral- 
lel moves property, which is a useful sufficient condition obtained by localizing 
the test for the diamond property of a parallel reduction. We proved the con- 
fluence of orthogonal STTRSs and orthogonal normal conditional STTRSs. We 
also provided a semantic method for proving termination of STTRSs based on 
monotone interpretation. 

Since the class of (conditional) STTRSs is a proper extension of the first-order 
case, all known results for the first-order TRSs can be applied to the subclass of 
our (conditional) STTRSs. We can also expect that many known results for the 
first-order TRSs can be lifted to the higher-order case without difficulty, because 
the behaviour of our higher-order extension is very close to that of the first-order 
(conditional) TRSs. 

Suzuki et al. gave a sufficient condition for the confluence of orthogonal first- 
order conditional TRSs possibly with extra variables in the right-hand sides 
of the rewrite rules |SMT95| . The author conjectures that their result can be 
extended to the higher-order case. 

Acknowledgments. I wish to thank Aart Middeldorp, Femke van Raamsdonk, 
and Fer-Jan de Vries for their comments on the preliminary version of this paper. 
I benefited from a discussion with Jaco van de Pol and Hans Zantema, which
resulted in Example My thanks are also due to the anonymous referees for 

Abstract. We describe the MPINE tool, a multi-threaded evaluator for
Interaction Nets. The evaluator is an implementation of the present
author's Abstract Machine for Interaction Nets [5] and uses POSIX threads
to achieve concurrent execution. When running on a multi-processor
machine (say an SMP architecture), parallel execution is achieved
effortlessly, allowing for desktop parallelism on commonly available machines.



Interaction Nets 

Interaction Nets [3] are a graph-rewriting formalism where the rewriting rules
are such that only pairs of nodes, connected in a specific way, may be
rewritten. Because of this restriction, the formalism enjoys strong local
confluence. Although the system has been introduced as a visual, simple, and
inherently parallel programming language, translations have been given of other
formalisms into Interaction Nets, specifically term-rewriting systems [1] and
the λ-calculus [2, 4]. When used as an intermediate implementation language for
these systems, Interaction Nets allow close control over the sharing of
reductions.

Interaction Nets have always seemed particularly well suited to parallel
implementation, since there can never be interference between the reduction of
two distinct redexes.

Moreover there are no global time or synchronization constraints. A parallel 
reducer for Interaction Nets provides a reducer for any of the formalisms that 
can be translated into these nets, without additional effort. 

We present here MPINE (for Multi-Processing Interaction Net Evaluator), a 
parallel reducer for Interaction Nets which runs on generic shared-memory multi- 
processors, based on the POSIX threads library. The system runs notably on the 
widely available SMP architecture machines running the Unix operating system. 



A Concurrent Abstract Machine for Interaction Nets 

In [5] the author proposed an abstract machine for Interaction Net reduction,
providing a decomposition of interaction steps into finer-grained operations.
The multi-threaded version of this machine is a device for the concurrent
implementation of Interaction Nets on shared-memory architectures, based on a
generalized version of the producer-consumers model: basic machine tasks are
kept on a shared queue, from which a number of threads take tasks for
processing. While doing so, new tasks may be generated, which will be enqueued.
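
The generalized producer-consumers scheme described above can be sketched as follows. This is a minimal illustration in Python, not MPINE's implementation (which is written in C on POSIX threads): the function name `run_pool` is hypothetical, and tasks are plain integers, where processing a task n > 0 generates the new task n - 1.

```python
import queue
import threading

def run_pool(initial_tasks, num_threads=4):
    """Process tasks from a shared queue; each worker may enqueue new tasks."""
    tasks = queue.Queue()
    for t in initial_tasks:
        tasks.put(t)
    processed = []
    lock = threading.Lock()          # protects the shared result list

    def worker():
        while True:
            n = tasks.get()
            if n is None:            # sentinel: no more work
                tasks.task_done()
                return
            with lock:
                processed.append(n)
            if n > 0:
                # Enqueue the generated task before task_done(), so that
                # tasks.join() cannot return while work is still pending.
                tasks.put(n - 1)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for th in threads:
        th.start()
    tasks.join()                     # wait until every task has been handled
    for _ in threads:
        tasks.put(None)              # one sentinel per worker
    for th in threads:
        th.join()
    return processed

print(sorted(run_pool([3, 2])))
```

Every thread acts as both consumer and producer, so no thread is dedicated to generating work. The order in which tasks are processed depends on scheduling, which is why the usage line sorts the result.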

Besides allowing for finer-grained parallelism than Interaction Nets them- 
selves, the decomposition of interaction also allows for improvements concerning 
the synchronization requirements of the implementation. In particular, it solves 
a basic deadlock situation which arises when one naively implements Interaction 
Nets in shared-memory architectures, and it lightens the overheads of synchro- 
nization - the number of mutexes required by each parallel operation is smaller 
than that required by a full interaction step, which may be quite significant. 

We direct the reader to ^ for details on these issues. 



The abstract machine may be implemented on any platform offering support 
for multi-threaded computation. MPINE, which uses POSIX threads, is, to the 
best of our knowledge, the first available parallel reducer for Interaction Nets. 

The program has a text-based interface. The user provides as input an inter- 
action net written in a language similar to that of P|, and the number of threads 
to be launched (which should in general be equal to the number of available pro- 
cessors in the target machine). The output of the program (if the input net has 
a normal form) is a description of the reduced net in the same language. 

We give a very simple example in which we declare three agents (Zero, Suc- 
cessor, Addition) that will allow us to implement the sum of Natural numbers 
by rewriting with Interaction Nets, using the usual inductive definition. 



This net represents the two equations S(0) + a = x and S(0) + x = b, or
simply S(0) + (S(0) + a) = b. Each cell has a unique principal port. For + cells
this represents the first argument of the sum. The file example.net contains a
description of the net together with the interaction system in which it is defined:



    agents
      Z 0
      S 1
      A 2
    rules
      A(x,x) >< Z;
      A(x,S(y)) >< S(A(x,y));
    net
      S(Z) = A(a,x);
      S(Z) = A(x,b);
    interface
      a;
      b;
    end



The agents section contains declarations of agents, with their arity; then the 
rules and the interaction net are given as sequences of active pairs, written as 
equations. The file ends with the interface of the net, a sequence of terms. 
Interaction rules rewrite active pairs of cells connected through their principal 
ports. An example is given below, corresponding to the (term-rewriting) rule 
S(y)+x → S(y+x). Interaction rules are written as pairs of terms by connecting
together the corresponding free ports in both sides of each rule, as shown: 




The following is an invocation of the reducer with 4 threads, with the above file 
as input, followed by the result produced: 

    > mpine -4 example.net

    ==============Initial Net===============
    Displaying net equalities...
    S(Z) = A(x0,x1)
    S(Z) = A(x2,x0)
    Displaying observable interface...
    x2
    x1
    ===============Reduced Net==============
    Displaying observable interface...
    x0
    S(S(x0))



The input net is printed followed by the reduced net. Since this is a net with no 
cycles, there are no equations left (if there were they would be displayed). The 
system may also be instructed to print some statistics, including the number of 
tasks performed by each individual thread. The reader is referred to the user’s 
guide for more information on additional features of the system. 
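
The reduced interface can be cross-checked against the term-rewriting reading of the two rules (0 + x → x and S(y) + x → S(y + x)). The following Python sketch is an ordinary recursive evaluator, not an interaction-net reducer; the tuple encoding of terms and the function name `add` are assumptions made for illustration.

```python
def add(m, n):
    """Peano addition by the rules  Z + n -> n  and  S(y) + n -> S(y + n)."""
    if m == "Z":
        return n
    if isinstance(m, tuple) and m[0] == "S":
        return ("S", add(m[1], n))
    raise ValueError("first argument must be a numeral built from Z and S")

one = ("S", "Z")
# The net encodes S(0) + a = x and S(0) + x = b, i.e. b = S(0) + (S(0) + a).
b = add(one, add(one, "a"))
print(b)  # ('S', ('S', 'a'))
```

With the free port a left symbolic, the result S(S(a)) has the same shape as the interface entry S(S(x0)) printed by the reducer.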

Some Benchmark Results

We show a set of benchmark results for the reduction of nets obtained from
λ-terms using the YALE translation of [4]. The terms are Church numerals, which
we have selected simply because they generate abundant computations.



    term      N.int     seq.red   par2red   par2/seq %
    2232II    37272     5.02      5.54      110
    423II     105911    29.8      15.7      52.7
    333II     473034    552       310       56.2
    2233II    1417653   7096      5381      75.8



For each term we show the number of interactions and the time taken to reduce 
the corresponding net, both by a sequential reducer (free of the synchronization 
overheads) and by MPINE running on a 2 processor machine. We also show the 
ratio of the two. These preliminary results are promising: in two of the above 
nets, the ideal goal of reducing by half the execution time is practically attained. 
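
The last column of the table can be recomputed from the two timing columns. The short Python sketch below only uses the figures reported above; the variable and function names are chosen for illustration.

```python
# (term, sequential time, 2-processor time) as reported in the table.
rows = [
    ("2232II", 5.02, 5.54),
    ("423II", 29.8, 15.7),
    ("333II", 552, 310),
    ("2233II", 7096, 5381),
]

def ratio_percent(seq, par):
    """Parallel time as a percentage of sequential time (50 would be ideal)."""
    return round(100 * par / seq, 1)

ratios = {term: ratio_percent(seq, par) for term, seq, par in rows}
for term, r in ratios.items():
    print(term, r)
```

The recomputed values agree with the table up to rounding. The smallest net runs slower in parallel (ratio above 100), and the two nets closest to the ideal of 50 are 423II and 333II.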



Availability 

MPINE is written in C. The distribution contains a user’s guide and some ex- 
ample files. It is available as a statically linked binary for Linux i386 ELF from 
the author’s homepage. A sequential reducer is also available. 



Future Work 

Further optimizations for MPINE are proposed in |SI, which have yet to be 
incorporated in the implementation. It also remains to test the implementation 
with a large set of terms, notably in machines with more than two processors. 



References 

1. Maribel Fernández and Ian Mackie. Interaction nets and term rewriting systems.
Theoretical Computer Science, 190(1):3-39, January 1998.

2. Georges Gonthier, Martín Abadi, and Jean-Jacques Lévy. The geometry of optimal
lambda reduction. In Proceedings of the 19th ACM Symposium on Principles of
Programming Languages (POPL'92), pages 15-26. ACM Press, January 1992.

3. Yves Lafont. Interaction nets. In Proceedings of the 17th ACM Symposium on
Principles of Programming Languages (POPL'90), pages 95-108. ACM Press, January
1990.

4. Ian Mackie. YALE: Yet another lambda evaluator based on interaction nets. In
Proceedings of the 3rd ACM SIGPLAN International Conference on Functional
Programming (ICFP'98), pages 117-128. ACM Press, September 1998.

5. Jorge Sousa Pinto. Sequential and concurrent abstract machines for interaction
nets. In Jerzy Tiuryn, editor, Proceedings of Foundations of Software Science and
Computation Structures (FOSSACS), number 1784 in Lecture Notes in Computer
Science, pages 267-282. Springer-Verlag, 2000.

6. Jorge Sousa Pinto. Parallel Implementation with Linear Logic (Applications of
Interaction Nets and of the Geometry of Interaction). PhD thesis, École
Polytechnique, 2001.



Stratego: A Language for Program 
Transformation Based on Rewriting Strategies 
System Description of Stratego 0.5 



Eelco Visser 

Institute of Information and Computing Sciences, Universiteit Utrecht, 
P.O. Box 80089, 3508 TB Utrecht, The Netherlands 
visser@acm.org, http://www.cs.uu.nl/~visser



1 Introduction 

Program transformation is used in many areas of software engineering. Examples 
include compilation, optimization, synthesis, refactoring, migration, normaliza- 
tion and improvement m- Rewrite rules are a natural formalism for expressing 
single program transformations. However, using a standard strategy for normal- 
izing a program with a set of rewrite rules is not adequate for implementing 
program transformation systems. It may be necessary to apply a rule only in 
some phase of a transformation, to apply rules in some order, or to apply a 
rule only to part of a program. These restrictions may be necessary to avoid 
non-termination or to choose a specific path in a non-confluent rewrite system. 

Stratego is a language for the specification of program transformation sys- 
tems based on the paradigm of rewriting strategies. It supports the separation of 
strategies from transformation rules, thus allowing careful control over the ap- 
plication of these rules. As a result of this separation, transformation rules are 
reusable in multiple different transformations and generic strategies capturing 
patterns of control can be described independently of the transformation rules 
they apply. Such strategies can even be formulated independently of the object 
language by means of the generic term traversal capabilities of Stratego. 

In this short paper I give a description of version 0.5 of the Stratego system,
discussing the features of the language (Section 2), the library (Section 3),
the compiler (Section 4) and some of the applications that have been built
(Section 5). Stratego is available as free software under the GNU General
Public License from http://www.stratego-language.org.



2 The Language 

In the paradigm of program transformation with rewriting strategies H3 a spec- 
ification of a program transformation consists of a signature, a set of rules and 
a strategy for applying the rules. The abstract syntax trees of programs are rep- 
resented by means of first-order terms. A signature declares the constructors of 
such terms. Labeled conditional rewrite rules of the form L: l -> r where s,

A. Middeldorp (Ed.): RTA 2001, LNCS 2051, pp. 357-^^] 2001. 

    module lambda-transform
    imports lambda-sig lambda-vars iteration simple-traversal
    rules
      Beta : App(Abs(x, e1), e2) -> <lsubs>([(x,e2)], e1)
    strategies
      simplify = bottomup(try(Beta))
      eager    = rec eval(try(App(eval, eval)); try(Beta; eval))
      whnf     = rec eval(try(App(eval, id)); try(Beta; eval))



Fig. 1. A Stratego module defining several strategies for transforming lambda expres- 
sions using beta reduction. Strategy simplify makes a bottom-up traversal over an 
expression trying beta reduction at each subexpression once, even under lambda ab- 
stractions. Strategy eager reduces the argument of a function before applying it, but 
does not reduce under abstractions. Strategy whnf reduces an expression to weak head- 
normal form, i.e., does not normalize under abstractions or in argument positions. 
Strategies eager and whnf use the congruence operator App to traverse terms of the
form App(e1,e2), while strategy simplify uses the generic traversal bottomup. The
strategy lsubs is a strategy for substituting expressions for variables. It is implemented
in module lambda-vars using a generic substitution strategy.



with l and r patterns, express basic transformations on terms. A rewriting
strategy combines rules into a program that determines where and in what order
the rules are applied to a term. An example specification is shown in Figure 1.

A strategy is an operation that transforms a term into another term or fails. 
Rules are basic strategies that perform the transformation specified by the rule 
or fail when either the subject term does not match the left-hand side or the con- 
dition fails. Strategies can be combined into more complex strategies by means 
of a language of strategy operators. These operators can be divided into opera- 
tors for sequential programming and operators for term traversal. The sequential 
programming operators identity (id) , failure (fail), sequential composition (;), 
choice (+), negation (not), test, and recursive closure (rec x(s)) combine strate- 
gies that apply to the root of a term. To achieve transformations throughout a 
term, a number of term traversal primitives are provided. For each construc- 
tor C/n, the corresponding congruence operator C(sl,...,sn) expresses the 
application of strategies to the direct sub-terms of a term constructed with C. 
Furthermore, a number of term traversal operators express generic traversal to 
the direct sub-terms of a term without reference to the constructor of the term. 
These constructs allow the generic definition of a wide range of traversals over 
terms. For example, the strategy all(s) applies s to each direct sub-term of 
a term. Using this operator one can define bottomup(s) = rec x(all(x); s), 
which generically defines the notion of a post-order traversal that visits each 
sub-term applying the parameter strategy s to it. 

A number of abstraction mechanisms are supported. A strategy definition 
of the form f(x1,...,xn) = s defines the new operator f with n parameters 
as an abstraction of the strategy s. An overlay of the form C(x1,...,xn) = t 
captures the pattern t in a new pseudo-constructor C. Constructors and 
strategy operators can be overloaded on arity. Strategies implemented in a foreign 
language (e.g., for accessing the file system) can be called via the prim construct. 

The distinction between rules and strategies is actually only idiomatic; that 
is, rules are abbreviations for strategies that are composed from the actual 
primitives of transformation: matching terms against patterns and building 
instantiations of patterns. Thus, a rule L : l -> r where s is just an 
abbreviation of the strategy L = {x1,...,xn: ?l; where(s); !r}, where the xi 
are the variables used in the rule. The construct {xs: s} delimits the scope of 
the variables xs to the strategy s. The strategy ?t matches the subject term 
against the pattern t, binding the variables in t to the corresponding sub-terms 
of the subject term. 
The strategy !t builds an instantiation of the term pattern t by replacing the 
variables in t by the terms to which they are bound. Decoupling pattern match- 
ing and term construction from rules and scopes, and making these constructs 
into first-class citizens, opens up a wide range of idioms such as contextual rules 
and recursive patterns. In these idioms a pattern match is passed on to a local 
traversal strategy to match sub-terms at variable depth in the subject term. 
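
The decomposition of a rule into match, condition, and build can be sketched as 
follows. This is a simplified Python model under stated assumptions, not 
Stratego's implementation: terms are tuples, pattern variables are instances of 
a hypothetical Var class, variable bindings live in a dictionary, and the where 
clause is modeled as a function from bindings to bindings (or None on failure) 
rather than as a full strategy applied to the subject term. 

```python
class Var:
    """Hypothetical marker for a pattern variable."""
    def __init__(self, name):
        self.name = name

def match(pattern, term, env):
    """Model of ?t: return env extended with bindings, or None on mismatch."""
    if isinstance(pattern, Var):
        if pattern.name in env:
            # An already bound variable must match the same term again.
            return env if env[pattern.name] == term else None
        return {**env, pattern.name: term}
    if isinstance(pattern, tuple):
        if not (isinstance(term, tuple) and len(term) == len(pattern)
                and term[0] == pattern[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None

def build(pattern, env):
    """Model of !t: replace each variable in the pattern by its binding."""
    if isinstance(pattern, Var):
        return env[pattern.name]
    if isinstance(pattern, tuple):
        return (pattern[0], *(build(p, env) for p in pattern[1:]))
    return pattern

def rule(lhs, rhs, where=None):
    """Model of L : l -> r where s, read as {x1,...,xn: ?l; where(s); !r}."""
    def run(t):
        env = match(lhs, t, {})      # fresh scope for the rule's variables
        if env is None:
            return None              # left-hand side does not match
        if where is not None:
            env = where(env)         # condition may fail or add bindings
            if env is None:
                return None
        return build(rhs, env)
    return run

# Hypothetical constant-folding rule: Add(Num(i), Num(j)) -> Num(k) where k = i + j
fold = rule(
    ("Add", ("Num", Var("i")), ("Num", Var("j"))),
    ("Num", Var("k")),
    where=lambda env: {**env, "k": env["i"] + env["j"]},
)
folded = fold(("Add", ("Num", 1), ("Num", 2)))
```

Because match and build are ordinary functions in this model, they can be used 
on their own, which is the property the text refers to as making them 
first-class. 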

Finally, specifications can be divided into modules that can import other 
modules. The above constructs of Stratego together with its module system 
make a powerful language that supports concise specification of program trans- 
formations. An operational semantics of System S, the core of the language, can 
be found in [13,14]. A limitation of the current language is that only a weak type 
system is implemented. Work is in progress to find a suitable type system that 
reconciles genericity with type safety. 

3 The Library 

The Stratego Library is a collection of modules (~45) with reusable rules 
(~130) and strategies (~300). Included in the library are strategies for sequential 
control, generic traversal, built-in data type manipulation (numbers and strings), 
standard data type manipulation (lists, tuples, optionals), generic language pro- 
cessing, and system interfacing (I/O, process control, association tables). 

The generic traversal strategies include one-pass traversals (such as topdown, 
bottomup, oncetd, and spinetd), fixed point traversal (such as reduce, inner- 
most, and outermost), and traversal with environments. The generic language 
processing algorithms cover free variable extraction, bound variable renaming, 
substitution, and syntactic unification. These algorithms are parameterized 
with the pattern of the relevant object language constructs and use the generic 
traversal capabilities of Stratego to ignore all constructs not relevant for the op- 
eration. For example, bound variable renaming is parameterized with the shape 
of variables and the binding constructs of the language. 
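
To make the difference between one-pass and fixed-point traversals concrete, 
the following Python sketch models three of the traversals named above. The 
definitions used here (topdown(s) = s; all(topdown(s)), oncetd(s) = s or 
one(oncetd(s)), innermost(s) = bottomup(try(s; innermost(s)))) are assumed 
formulations chosen for illustration and are not copied from the library 
source. Terms are tuples, failure is None, and the rule dneg is hypothetical. 

```python
def seq(s1, s2):
    """Sequential composition: apply s1, then s2; fail if either fails."""
    return lambda t: (lambda r: None if r is None else s2(r))(s1(t))

def choice(s1, s2):
    """Try s1 first; if it fails, apply s2 to the original term."""
    return lambda t: (lambda r: s2(t) if r is None else r)(s1(t))

def try_(s):
    """Apply s if it succeeds, otherwise leave the term unchanged."""
    return choice(s, lambda t: t)

def all_(s):
    """Apply s to every direct sub-term; fail if any application fails."""
    def run(t):
        if not isinstance(t, tuple):
            return t
        kids = [s(k) for k in t[1:]]
        return None if any(k is None for k in kids) else (t[0], *kids)
    return run

def one(s):
    """Apply s to the first direct sub-term on which it succeeds."""
    def run(t):
        if not isinstance(t, tuple):
            return None
        for i in range(1, len(t)):
            r = s(t[i])
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
        return None
    return run

def bottomup(s):
    def run(t):
        return seq(all_(run), s)(t)
    return run

def topdown(s):
    def run(t):
        return seq(s, all_(run))(t)
    return run

def oncetd(s):
    """Apply s once, at the first position found in a top-down search."""
    def run(t):
        return choice(s, one(run))(t)
    return run

def innermost(s):
    """Apply s repeatedly, innermost first, until it applies nowhere."""
    def run(t):
        return bottomup(try_(seq(s, run)))(t)
    return run

def dneg(t):
    """Hypothetical rule Not(Not(e)) -> e."""
    if (isinstance(t, tuple) and len(t) == 2 and t[0] == "Not"
            and isinstance(t[1], tuple) and len(t[1]) == 2 and t[1][0] == "Not"):
        return t[1][1]
    return None

a, b = ("Var", "a"), ("Var", "b")
term = ("And", ("Not", ("Not", a)), ("Not", ("Not", ("Not", ("Not", b)))))

once = oncetd(dneg)(term)          # only the first redex is rewritten
one_pass = topdown(try_(dneg))(term)  # one visit per node; a redex remains
normal = innermost(dneg)(term)     # no redex remains
```

In this example the one-pass topdown traversal leaves Not(Not(b)) behind, 
because the redex it exposes sits at a position it has already visited, whereas 
innermost continues until the rule no longer applies. 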

4 The Compiler 

The Stratego Compiler translates specifications to C code. The run-time system 
is based on the ATerm library, which supports the ATerm Format, a 
representation for first-order terms with prefix application syntax. The library 
implements writing and reading ATerms to and from the external format, which 
is used to exchange terms between tools. This enables component-based 
development of transformation tools. For example, a Stratego program can transform 
abstract syntax trees produced by any parser as long as it produces an ATerm 
representation of the abstract syntax tree for a program. 
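
The idea of exchanging terms between tools in a textual prefix syntax can be 
illustrated with a small writer and reader. This Python sketch is not the ATerm 
library and does not implement the full ATerm Format; it assumes a minimal 
subset (constructor applications, integers, and double-quoted strings), and the 
function names write_term and read_term are hypothetical. 

```python
import re

# Tokens of the assumed subset: identifiers, integers, quoted strings, ( ) ,
TOKEN = re.compile(r'\s*([A-Za-z_][A-Za-z0-9_]*|-?\d+|"(?:[^"\\]|\\.)*"|[(),])')

def write_term(t):
    """Serialize a tuple-encoded term to prefix application syntax."""
    if isinstance(t, tuple):
        if len(t) == 1:
            return t[0]  # constant: constructor without arguments
        return t[0] + "(" + ",".join(write_term(k) for k in t[1:]) + ")"
    if isinstance(t, int):
        return str(t)
    # Strings are quoted; backslashes and quotes are escaped.
    return '"' + t.replace("\\", "\\\\").replace('"', '\\"') + '"'

def read_term(text):
    """Parse the textual form back into a tuple-encoded term.

    Characters outside the token grammar are silently skipped; a real
    reader would report them as errors.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def parse():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok.startswith('"'):
            return re.sub(r'\\(.)', r'\1', tok[1:-1])  # undo escaping
        if tok.lstrip("-").isdigit():
            return int(tok)
        kids = []
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1
            while tokens[pos] != ")":
                kids.append(parse())
                if tokens[pos] == ",":
                    pos += 1
            pos += 1  # consume ")"
        return (tok, *kids)

    term = parse()
    if pos != len(tokens):
        raise ValueError("trailing input after term")
    return term

# A lambda-calculus term as one tool might hand it to the next.
ast = ("App", ("Abs", "x", ("Var", "x")), ("Var", "y"))
text = write_term(ast)
roundtrip = read_term(text)
```

Any parser that emits this kind of text can feed a transformation written 
against the same constructors, which is the component-based workflow the 
paragraph above describes. 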

The compiler has been bootstrapped, that is, all components except the 
parser are specified in Stratego itself. The compiler performs various 
optimizations, including extracting the definitions that are used in the main 
strategy, aggressive inlining to enable further optimizations, and merging of 
matching patterns to avoid backtracking. A limitation of the current compiler is that it does 
not support separate compilation and that compilation of the generated code by 
gcc is rather slow, resulting in long compilation times (e.g., 3 minutes for a large 
compiler component). Overcoming this limitation is the focus of current work. 

5 Applications 

Stratego is intended for use in a wide range of language processing applications 
including source-to-source transformation, application generation, program op- 
timization, compilation, and documentation generation. It is not intended for 
interactive program transformation or theorem proving. 

Examples of applications that use Stratego are XT, CodeBoost, HSX, and 
a Tiger compiler. XT is a bundle of program transformation tools in which 
Stratego is included as the main language for implementing program 
transformations. The bundle comes with a collection of grammars for standard languages 
and many tools implemented in Stratego for generic syntax tree manipulation, 
grammar analysis and transformation, and derivation of tools from grammars. 
CodeBoost is a framework for the transformation of C++ programs that is 
developed for domain-specific optimization of C++ programs for numerical 
applications. HSX is a framework for the transformation of core Haskell programs 
that has been developed for the implementation of the warm fusion algorithm 
for deforesting functional programs [5]. The Tiger compiler translates Tiger 
programs to MIPS assembly code. The compiler includes translation to 
intermediate representation, canonicalization of intermediate representation, 
instruction selection, and register allocation. 

6 Related Work 

The creation of Stratego was motivated by the limitations of a fixed (innermost) 
strategy for rewriting, in particular based on experience with the algebraic 
specification formalism ASF+SDF [2]. The design of the strategy operators was 
inspired by the strategy language of ELAN, a specification language based on 
the paradigm of rewriting logic. For a comparison of Stratego with other 
systems see H3C3. A survey of program transformation systems in general can be 
found in nni. The contributions of Stratego include: generic traversal primitives 
that allow definition of generic strategies; break-down of rules into primitives 
match and build giving rise to first-class pattern matching; many programming 
idioms for strategic rewriting; bootstrapped compilation of strategies; a foreign 
function interface; component-based programming based on exchange of ATerms. 